Abstract
This material relies on John C. Hull (Hull 2015) credit risk chapters and Lore Dirick credit risk DataCamp course. Some mathematical background is skipped to emphasize the data analysis, model logic, discussion, graphical approach and R coding (R Core Team 2021). As in the philosophy of Donald Knuth (Knuth 1984), the objective of this document is to explain to human beings what we want a computer to do as literate programming. This is a work in progress and it is under revision.# Logistic models
library(gmodels) # CrossTable()
library(ggplot2)
library(tidyr) # gather()
library(dplyr)
library(pROC) # roc
This section relies on the DataCamp course Credit Risk Modeling in R by Lore Dirick. However, we incorporate a slightly different database and an extended analysis.
Let’s load the data called loan_data_ARF.rds and then understand its structure before conducting any further analysis. This database is available here.
dat <- readRDS("loan_data_ARF.rds")
str(dat)
## 'data.frame': 29092 obs. of 10 variables:
## $ loan_status : int 0 0 0 0 0 0 1 0 1 0 ...
## $ loan_amnt : int 5000 2400 10000 5000 3000 12000 9000 3000 10000 1000 ...
## $ int_rate : num 10.6 11 13.5 11 11 ...
## $ grade : Factor w/ 7 levels "A","B","C","D",..: 2 3 3 1 5 2 3 2 2 4 ...
## $ emp_length : int 10 25 13 3 9 11 0 3 3 0 ...
## $ home_ownership: Factor w/ 4 levels "MORTGAGE","OTHER",..: 4 4 4 4 4 3 4 4 4 4 ...
## $ annual_inc : num 24000 12252 49200 36000 48000 ...
## $ age : int 33 31 24 39 24 28 22 22 28 22 ...
## $ sex : Factor w/ 2 levels "0","1": 1 1 1 1 2 2 2 2 1 2 ...
## $ region : Factor w/ 4 levels "E","N","S","W": 1 1 3 3 2 2 2 2 4 1 ...
This could be a typical database taken from any given financial institution like a bank or a firm that uses credit channels to sell their products or services. Here, we have 29,092 observations of 10 variables. Each observation corresponds to the personal and loan characteristics of one loan. An important variable, our dependent variable, is loan_st the value of 0 is no default and the value of 1 is default. A default occurs when a borrower is unable to make timely payments, misses payments, or avoids or stops making payments on interest or principal owed. Then, the definition of default depends on the interests and objectives of the analysis, here we simply classify between default or no default. The variable loan_st is dichotomous or categorical. Here, we are interested to predict whether a new application will default or not.
Clearly, loan_data_ARF.rds is past information as we know with certainty whether the individual defaulted (1) or not (0). Past information is helpful to better understand how likely is that one individual may default according to their personal and loan characteristics. Past information is useful to train our quantitative models to make predictions of new applicants, and even evaluate the performance of our classifications.
A variable name is too long when there exists a shorter name that equally conveys the purpose of the variable. Let’s rename some of them.
old_names <- colnames(dat)
colnames(dat) <- c("loan_st", "l_amnt", "int", "grade", "emp_len",
"home", "income", "age", "sex", "region")
data.frame(old_names, "new_names" = colnames(dat))
## old_names new_names
## 1 loan_status loan_st
## 2 loan_amnt l_amnt
## 3 int_rate int
## 4 grade grade
## 5 emp_length emp_len
## 6 home_ownership home
## 7 annual_inc income
## 8 age age
## 9 sex sex
## 10 region region
We can take a look at the information in different ways. For example, look at the first 10 rows out of 29,092.
head(dat, 10)
## loan_st l_amnt int grade emp_len home income age sex region
## 1 0 5000 10.65 B 10 RENT 24000 33 0 E
## 2 0 2400 10.99 C 25 RENT 12252 31 0 E
## 3 0 10000 13.49 C 13 RENT 49200 24 0 S
## 4 0 5000 10.99 A 3 RENT 36000 39 0 S
## 5 0 3000 10.99 E 9 RENT 48000 24 1 N
## 6 0 12000 12.69 B 11 OWN 75000 28 1 N
## 7 1 9000 13.49 C 0 RENT 30000 22 1 N
## 8 0 3000 9.91 B 3 RENT 15000 22 1 N
## 9 1 10000 10.65 B 3 RENT 100000 28 0 W
## 10 0 1000 16.29 D 0 RENT 28000 22 1 E
Note that sex is 1 for female and 0 for male. Now, instead of looking the details of the first 10 rows, we can summarize with respect to home with the CrossTable() function.
CrossTable(dat$home)
##
##
## Cell Contents
## |-------------------------|
## | N |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 29092
##
##
## | MORTGAGE | OTHER | OWN | RENT |
## |-----------|-----------|-----------|-----------|
## | 12002 | 97 | 2301 | 14692 |
## | 0.413 | 0.003 | 0.079 | 0.505 |
## |-----------|-----------|-----------|-----------|
##
##
##
##
We can also use CrossTable() to summarize two variables. In particular, instead of counting for home ownership we can add a second dimension like loan_st. This allows us to create more informative tables.
CrossTable(dat$home, dat$loan_st, prop.r = TRUE,
prop.c = FALSE, prop.t = FALSE, prop.chisq = FALSE)
##
##
## Cell Contents
## |-------------------------|
## | N |
## | N / Row Total |
## |-------------------------|
##
##
## Total Observations in Table: 29092
##
##
## | dat$loan_st
## dat$home | 0 | 1 | Row Total |
## -------------|-----------|-----------|-----------|
## MORTGAGE | 10821 | 1181 | 12002 |
## | 0.902 | 0.098 | 0.413 |
## -------------|-----------|-----------|-----------|
## OTHER | 80 | 17 | 97 |
## | 0.825 | 0.175 | 0.003 |
## -------------|-----------|-----------|-----------|
## OWN | 2049 | 252 | 2301 |
## | 0.890 | 0.110 | 0.079 |
## -------------|-----------|-----------|-----------|
## RENT | 12915 | 1777 | 14692 |
## | 0.879 | 0.121 | 0.505 |
## -------------|-----------|-----------|-----------|
## Column Total | 25865 | 3227 | 29092 |
## -------------|-----------|-----------|-----------|
##
##
This table reveals defaults by home ownership. We can use histograms to see one variable distribution. In this case we have the interest rate distribution.
ggplot(dat, aes(x = int)) +
geom_histogram(aes(y=..density..), binwidth = 0.5, colour = "black",
fill = "white") +
labs(y = "Density", x = "Interest rate") +
theme(legend.position = "bottom", legend.title = element_blank())
Figure 1.1: Interest rate histogram.
The following is a boxplot figure for the annual income.
ggplot(dat, aes(income)) +
geom_boxplot() +
labs(y = "Density",
x = "Annual income") +
theme(legend.position = "bottom", legend.title = element_blank())
Figure 1.2: Annual income boxplot.
The plot above looks suspicious. We have a very large value of the annual income in the horizontal axis (6,000,000). Also, there are few individuals with a very high income. We should explore further and investigate if these are valid observations or simply a mistake in the original database.
high_income <- dat[(dat$income > 1000000), ]
high_income
## loan_st l_amnt int grade emp_len home income age sex region
## 4861 0 12025 14.27 C 13 RENT 1782000 63 0 E
## 13931 0 10000 6.54 A 16 OWN 1200000 36 0 W
## 15386 0 1500 10.99 A 5 MORTGAGE 1900000 60 1 N
## 16713 0 12000 7.51 A 1 MORTGAGE 1200000 32 0 E
## 19486 0 5000 12.73 C 12 MORTGAGE 6000000 144 1 E
## 22811 0 10000 10.99 A 1 MORTGAGE 1200000 40 1 E
## 23361 0 6400 7.40 A 7 MORTGAGE 1440000 44 1 E
## 23683 0 6600 7.74 A 9 MORTGAGE 1362000 47 0 E
## 28468 0 8450 12.29 C 0 RENT 2039784 42 0 E
One guy (19486) is not only rich, he is 144 years old. So, my decision is to drop these 9 observations. Cleaning data is a common task when dealing with large databases. This is fine as long as we do not alter the nature of the data.
high_income_index <- data.frame(value = as.integer(rownames(high_income)))
dat <- dat[-high_income_index$value,]
Remember the database has originally 29,092 rows and now we are dropping 9 so we end up with 29,083. See the result.
ggplot(dat, aes(income)) +
geom_boxplot() +
labs(y = "Density", x = "Annual income") +
theme(legend.position = "bottom", legend.title = element_blank())
Figure 1.3: Annual income without extreme values.
Somewhat better now.
Logistic models allows us to make predictions about loan defaults. Logistic regression is a statistical model that in its basic form uses a logistic function to model a binary dependent variable like loan_st. In this case, the binary dependent variable is default (1) or no default (0). A good reference for this section is Hull (2020).
First, load the data and split it into two sets: (1) training and (2) test. The training set is for building and estimate models, and the test set is used to evaluate our model predictions with new data. When estimating models, it is common practice to separate the available data into two parts, training and test data, where the training data is used to estimate parameters (in-sample) and the test data is used to evaluate its accuracy (out of sample). Because the test data is not used in determining the estimation, it should provide a reliable indication of how well the model is likely to estimate or forecast on new data. In sum, we train the model, we test the model, and once we are OK with the model performance on new data, we are ready to use it in real-life applications. If we ignore this split and use the whole database to estimate our models, we may succeed at explaining defaults in our database but we may fail to explain defaults for new loan applications.
# It is convenient to set the loan status as factor.
dat$loan_st <- as.factor(dat$loan_st)
set.seed(567)
index_train <- cbind(runif(1 : nrow(dat), 0 , 1), c(1 : nrow(dat)))
index_train <- order(index_train[, 1])
index_train <- index_train[1: (2/3 * nrow(dat))]
# Create training set
train <- dat[index_train, ]
# Create test set
test <- dat[-index_train, ]
We have 29,083 observations in dat. The code above randomly selects \(29083 \times (2/3)=19388\) rows to form the train. The test are the remaining \(29083-19388=9695\) rows. The random selection is highly recommended as the dat may have some structure or sorting that could bias our model estimation and negatively impact our model test. For example, imagine that for some weird reason the database is sorted in such a way that the first observations are all no default. If that is the case, then the training and the test set would not have portions of default and no default cases and we may distort the whole analysis. The random selection allows us to replicate a real situation in which our database is unsorted, with different characteristics.
Let’s see if the recent created train and test have the same basic properties than dat.
dat_prop <- table(dat$loan_st)/sum(table(dat$loan_st))
train_prop <- table(train$loan_st)/sum(table(train$loan_st))
test_prop <- table(test$loan_st)/sum(table(test$loan_st))
prop <- data.frame(rbind(dat_prop, train_prop, test_prop))
colnames(prop) <- c("no defaults", "defaults")
prop
## no defaults defaults
## dat_prop 0.8890417 0.1109583
## train_prop 0.8882298 0.1117702
## test_prop 0.8906653 0.1093347
Take a look of the training set.
# See the data structure.
head(train)
## loan_st l_amnt int grade emp_len home income age sex region
## 21547 0 4000 10.25 B 6 RENT 26000 27 1 N
## 22625 0 5000 14.26 C 3 RENT 280000 36 1 N
## 20340 0 18000 18.30 F 10 RENT 121000 24 1 N
## 1911 0 5600 7.90 A 3 RENT 32000 22 0 W
## 4021 0 2700 14.27 C 5 MORTGAGE 88500 32 1 N
## 16887 0 15000 10.38 B 10 MORTGAGE 75000 23 0 W
Variables as factors are useful for model estimation and data visualization. Factors are variables in R which take on a limited number of different values; such variables are often referred to as categorical variables.
Assume we think that the loan_st depends on the age of the individual. We can estimate a simple logistic model to learn about the relationship between age and loan status.
# Fitting a simple logistic model.
logi_age <- glm(loan_st ~ age, family = "binomial", data = train)
logi_age
##
## Call: glm(formula = loan_st ~ age, family = "binomial", data = train)
##
## Coefficients:
## (Intercept) age
## -1.90097 -0.00623
##
## Degrees of Freedom: 19387 Total (i.e. Null); 19386 Residual
## Null Deviance: 13580
## Residual Deviance: 13580 AIC: 13580
Apparently, there is a negative relationship between age and loan status. The AIC value (13,580) is useful when comparing models. The Akaike information criterion (AIC) is a mathematical method for evaluating how well a model fits the data it was generated from. In statistics, AIC is used to compare different possible models and determine which one is the best fit for the data. At the moment we cannot interpret the AIC simply because we only have one model and we cannot compare it with another AIC.
Let’s estimate another simple model where the interest rate category is used as a predictor of the loan_st. Remember we are not conducting any prediction at all at this moment, we are only estimating models using the training set.
# Build a glm model with variable interest rate as a predictor.
logi_int <- glm(formula = loan_st ~ int, family = "binomial", data = train)
# Print the parameter estimates.
logi_int
##
## Call: glm(formula = loan_st ~ int, family = "binomial", data = train)
##
## Coefficients:
## (Intercept) int
## -3.710 0.142
##
## Degrees of Freedom: 19387 Total (i.e. Null); 19386 Residual
## Null Deviance: 13580
## Residual Deviance: 13210 AIC: 13220
The AIC is a lower (13,220 versus 13,580), so we have a better model now.
Using one single predictor as age or interest rate is clearly a limited approach. Let’s add some more predictors. Also, let’s introduce the summary() function to extract more information about the model estimation results. The logi_multi below assumes that the loan status depend on the age, interest rate, grade, loan amount, and annual income.
# Multiple variables in a logistic regression model.
logi_multi <- glm(loan_st ~ age + int + grade + log(l_amnt) +
log(income) , family = "binomial", data = train)
# Obtain significance levels using summary().
summary(logi_multi)
##
## Call:
## glm(formula = loan_st ~ age + int + grade + log(l_amnt) + log(income),
## family = "binomial", data = train)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.1545 -0.5351 -0.4397 -0.3382 2.6101
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 1.996240 0.477911 4.177 2.95e-05 ***
## age -0.002302 0.003825 -0.602 0.5473
## int 0.038767 0.017249 2.247 0.0246 *
## gradeB 0.503409 0.087435 5.758 8.54e-09 ***
## gradeC 0.748229 0.117765 6.354 2.10e-10 ***
## gradeD 0.964343 0.147283 6.548 5.85e-11 ***
## gradeE 1.033442 0.190817 5.416 6.10e-08 ***
## gradeF 1.619470 0.257900 6.279 3.40e-10 ***
## gradeG 1.867494 0.440232 4.242 2.21e-05 ***
## log(l_amnt) 0.015718 0.036341 0.433 0.6654
## log(income) -0.470748 0.046423 -10.140 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 13579 on 19387 degrees of freedom
## Residual deviance: 13028 on 19377 degrees of freedom
## AIC: 13050
##
## Number of Fisher Scoring iterations: 5
Our multi-factor model works well. In logi_multi, the AIC value is the lowest so far (13,050 versus 13,220), so this should be considered as the best in-sample model at the moment. The summary() function shows the significance levels of the estimators, but we are currently more interested in the goodness of fit of the models because we want to conduct predictions about the loan_st. This is, we are interested to use a model to find out whether new applicants in the test set are expected to default or not, rather than in the applicants’ credit risk factors. This is why we are concentrated in AIC now.
When a customer fill out a credit application form, we collect information but we do not know for sure whether she or he will eventually default. A credit risk model can help us in this task.
Let’s take our three models: logi_age, logi_int and logi_multi from the previous subsection to carry out a simple prediction exercise. We start by identifying one observation in the test set and ask the models to predict the loan_st. This is, we take the first guy age, then we apply the logi_age model, and compare the predicted loan_st with respect to what really happened. Remember we know what really happened with this guy because we have the information in the test set. Every model is expected to produce different loan_st predictions. If the model is good, then the predicted loan_st will match what really happened.
# Define one single observation in test_set.
John_Doe <- as.data.frame(test[1, ])
John_Doe
## loan_st l_amnt int grade emp_len home income age sex region
## 1 0 5000 10.65 B 10 RENT 24000 33 0 E
We know in advance that the loan_st of this observation taken from the test set is 0. However, the models cannot know this simply because we did not use the test set to estimate the logistic models. Our models were estimated using the training set. A good credit risk model should predict a no default given this new applicant.
The values of loan_st in the test set is either 0 or 1. However, the logistic models estimate the loan_st as values in the range of 0 to 1. This mean that we would expect the estimated loan_st to be close to 0. But, how close? We will deal with this issue later.
# Predict the loan status.
logi_age_pred <- predict(logi_age, newdata = John_Doe, type = "response")
logi_int_pred <- predict(logi_int, newdata = John_Doe, type = "response")
logi_multi_pred <- predict(logi_multi, newdata = John_Doe, type = "response")
# Collect all.
pred_John <- rbind("logi_age" = logi_age_pred,
"logi_int" = logi_int_pred,
"logi_multi" = logi_multi_pred)
# Prepare a table.
colnames(pred_John) <- "Loan status predictions for John Doe."
pred_John
## Loan status predictions for John Doe.
## logi_age 0.10846236
## logi_int 0.09996702
## logi_multi 0.14461840
These values are low as they are close to 0. We could interpret this as a certain ability of the models to predict this single case from the test set. However, several questions remains unanswered and requires further analysis. For example: How can we determine if the prediction is low enough to consider it as a non-default? We may need a cut-off value to decide. We will explore this issue later.
Another aspect is: What about the rest of the cases in the test set? We have 9,695 observations in the test set and in the example above we only test for the first one. We are interested in the entire test set, not only for John Doe. Fortunately, this issue is easy to address as we only need to change the newdata parameter in the predict() function. In particular, instead of newdata = John_Doe, which is one observation, and can change it to newdata = test, which is the entire 9,695 test set.
# Predict the loan status with the three models.
pred_logi_age <- predict(logi_age, newdata = test, type = "response")
pred_logi_int <- predict(logi_int, newdata = test, type = "response")
pred_logi_multi <- predict(logi_multi, newdata = test,
type = "response")
pred_range <- rbind("logi_age" = range(pred_logi_age),
"logi_int" = range(pred_logi_int),
"logi_multi" = range(pred_logi_multi))
aic <- rbind(logi_age$aic, logi_int$aic, logi_multi$aic)
pred_range <- cbind(pred_range, aic)
colnames(pred_range) <- c("min(loan_st)", "max(loan_st)", "AIC")
pred_range
## min(loan_st) max(loan_st) AIC
## logi_age 0.08417892 0.1165453 13580.65
## logi_int 0.05019756 0.3982977 13215.61
## logi_multi 0.01739107 0.4668004 13050.30
Now, instead of the prediction of one single applicant we have conducted a prediction of all 9,695 applicants in the test set. The lower value column corresponds to the lower predicted loan_st for each model. The logistic models produce values in the range of zero to one, and in this case the ranges are rather narrow.
Narrow ranges (the difference between higher and lower loan_st predicted values) could be problematic because the model could not be able to discriminate between defaults (predictions closer to 1) and no-defaults (predictions closer to 0). The higher AIC corresponds to the worst in-sample model and the lower AIC to the best in-sample model. Here, we can see some in and out of sample consistency because the best model according to the AIC, corresponds to the model with the higher prediction range.
Let’s explore all the predicted loan_st values for the logi_age:
ggplot(data.frame(pred_logi_age), aes(x = pred_logi_age)) +
geom_density(fill = "red") +
labs(y = "Density", x = "Default prediction") +
theme(legend.position = "bottom", legend.title = element_blank())
Figure 1.4: Age model prediction histogram.
The logi_age fails to predict values ranging from 0 to 1. In fact, these values are quite concentrated in a very small range of values. As a consequence, this model fails to differentiate between default and no default predictions.
Let’s visualize the predictions of the logi_int and logi_multi. We first collect all predictions in a single data frame just for convenience.
pred_logi <- data.frame(cbind(pred_logi_age, pred_logi_int,
pred_logi_multi))
pred_logi <- gather(pred_logi, key = "model", value = "pred")
Now we plot the logi_int and logi_multi.
ggplot(pred_logi[pred_logi$model != "pred_logi_age",],
aes(x = pred, fill = model)) +
geom_density(alpha = 0.4) +
labs(y = "Density", x = "Default prediction") +
theme(legend.position = "bottom", legend.title = element_blank())
Figure 1.5: Interest rate and multi models predictions histograms.
Let’s add the age model as well.
ggplot(pred_logi, aes(x = pred, y = model, fill = model)) +
geom_boxplot() +
labs(y = "Model", x = "Default prediction") +
theme(legend.position = "none", legend.title = element_blank())
Figure 1.6: Age, Interest rate and multi models predictions boxplot.
Presumably, a model which considers all available predictors could be better for predicting the loan_st.
# Logistic regression model using all available predictors in the data set.
logi_full <- glm(loan_st ~ age + int + grade + log(l_amnt) +
log(income) + emp_len + home + sex +
region, family = "binomial", data = train)
# Loan status predictions for all test set elements.
pred_logi_full <- predict(logi_full, newdata = test, type = "response")
# Look at the predictions range.
range(pred_logi_full)
## [1] 1.422469e-09 8.544424e-01
Now, the pred_logi_full prediction range is wider. A wider range means that the loan_st predictions are now closer to 1. This is good because we need the model to be able to predict both no-defaults (0) and defaults (1). Let’s see a prediction comparison with respect to the rest of the models.
pred_range <- rbind("logi_age" = range(pred_logi_age),
"logi_int" = range(pred_logi_int),
"logi_multi" = range(pred_logi_multi),
"logi_full" = range(pred_logi_full))
aic <- rbind(logi_age$aic, logi_int$aic, logi_multi$aic, logi_full$aic)
pred_range <- cbind(pred_range, aic)
colnames(pred_range) <- c("min(loan_st)", "max(loan_st)", "AIC")
pred_range
## min(loan_st) max(loan_st) AIC
## logi_age 8.417892e-02 0.1165453 13580.65
## logi_int 5.019756e-02 0.3982977 13215.61
## logi_multi 1.739107e-02 0.4668004 13050.30
## logi_full 1.422469e-09 0.8544424 10763.67
pred_logi <- data.frame(cbind(pred_logi_age, pred_logi_int,
pred_logi_multi, pred_logi_full))
pred_logi <- gather(pred_logi, key = "model", value = "pred")
A graphical comparison of the new pred_logi_full and pred_logi_multi.
ggplot(pred_logi[pred_logi$model != "pred_logi_age" &
pred_logi$model != "pred_logi_int" ,],
aes(x = pred, fill = model)) +
geom_density(alpha = 0.4) +
labs(y = "Density", x = "Default prediction") +
theme(legend.position = "bottom", legend.title = element_blank())
Figure 1.7: Multi and full models predictions histograms.
And all of them together.
ggplot(pred_logi, aes(x = pred, y = model, fill = model)) +
geom_boxplot() +
labs(y = "Model", x = "Default prediction") +
theme(legend.position = "none", legend.title = element_blank())
Figure 1.8: Age, Interest rate, multi and full models predictions boxplot.
The logi_full model predictions looks better than the other models.
Another question is: How can we know these model predictions corresponds to a default or no default? The loan status predictions go from 0 to 0.854 for the case of the logi_full model. At the end, these loan status estimations need to be interpreted or classified as a default or no default because we are interested on that. Are they closer enough to 0 to consider a no default? This issue is addressed by setting up a cut-off rate so we can split all estimated loan status into 0 or 1.
Let’s arbitrarily consider a cutoff of 0.15 for now. This means that every estimated loan status below 0.15 will be considered as 0 (no-default), and every estimated loan status above 0.15 will be considered as 1 (default). This classification rule can be used by a financial firm to decide whether to accept or reject new loan applications by rejecting loan applications with an estimated loan status above 0.15, and accepting those with an estimated loan status below 0.15. The accept/reject loan decision now depends on a model estimation. However, this simple approach could be controversial as managers may be interested in allocating more loans, not less. An alternative that meets credit risk and managers interests is to create several cutoff ranges instead of a single cutoff of 0.15. For example, the financial firm may accept loan applications with an estimated loan status between 0.15 and 0.25 with a higher interest rate to compensate for the additional credit risk. On the other hand, we may consider a different view by incorporating a social criterion to help those people who are naturally excluded by the financial industry given their economic condition. If this is the case, then, we may be interested to identify those applicants with a lower estimated loan status.
For now, consider that every estimated loan status below 0.15 will be considered as 0 (no-default), and every estimated loan status above 0.15 will be considered as 1 (default). Graphically it looks like this:
ggplot(pred_logi[pred_logi$model == "pred_logi_full",],
aes(x = pred, fill = model)) +
geom_density(alpha = 0.4) +
geom_vline(xintercept = 0.15, linetype = "longdash") +
labs(y = "Density", x = "Default prediction") +
theme(legend.position = "none", legend.title = element_blank())
Figure 1.9: Full models predictions histogram and cut-off.
Here, predicted loan_st values at the left of the dashed line represent no default predictions and values at the right of the dashed line represent default predictions. Let’s set up the rule to convert the estimated loan status into a binary (0 or 1) variable. See how this transformation takes place:
# Make a binary predictions-vector using a cut-off of 15%
pred_cutoff_15 <- ifelse(pred_logi_full > 0.15, 1, 0)
head(cbind(pred_logi_full, "rounded" = round(pred_logi_full, 4), pred_cutoff_15))
## pred_logi_full rounded pred_cutoff_15
## 1 1.357995e-08 0.0000 0
## 2 2.986278e-08 0.0000 0
## 18 2.299645e-02 0.0230 0
## 26 2.403879e-01 0.2404 1
## 27 1.778986e-01 0.1779 1
## 28 2.600172e-02 0.0260 0
These are only the first 6 rows in the test set. We can see that the rule works as expected because every estimated loan status below 0.15 is now considered as 0 (no-default), and every estimated loan status above 0.15 is considered as 1 (default). The table above shows how we can create this binary variable given the logistic model prediction.
Note that the rows numbers in the table above are 1, 2, 18, 26, 27 and 28. These are not 1, 2, 3, 4, 5 and 6 because the test rows were selected randomly out of the dat. So, the row numbers in the table above correspond to the original place in dat.
Is logi_full a good model after all? We can add a new column to the previous table. This new variable represents what really happened with the loan. Then, the first column is the logistic model prediction, the second column the transformed binary variable given a cutoff of 0.15, and the third column is what actually happened (default or no-default). Let’s take a sample of 10 observations to conduct the comparison easily. Note that the model correctly predicts a no default in most cases. Rows 308 and 329 predict a default incorrectly and row 323 predict a default correctly.
# Let's take from rows 101 to 110.
(cbind(pred_logi_full, pred_cutoff_15,
as.data.frame.numeric(test$loan_st)))[101:110,]
## pred_logi_full pred_cutoff_15 test$loan_st
## 308 1.537106e-01 1 0
## 309 4.137654e-02 0 0
## 310 3.113926e-02 0 0
## 312 5.186962e-09 0 0
## 318 5.337092e-02 0 0
## 319 8.402239e-02 0 0
## 323 2.795535e-01 1 1
## 328 5.384112e-09 0 0
## 329 1.893745e-01 1 0
## 330 9.012136e-03 0 0
There is an easy way to evaluate the rest of the cases in the test set using a simple table called confusion matrix.
# Construct a confusion matrix.
table(test$loan_st, pred_cutoff_15)
## pred_cutoff_15
## 0 1
## 0 6554 2081
## 1 308 752
The logi_full model predicts 6,554 of no-defaults correctly and 752 defaults correctly. However, the model predicts 2,081 defaults that are in fact no-defaults and 308 no-defaults that are in fact defaults. How good are these results? Which of these four values is more important? These are questions we address later. For now, we can say that different models and different cut-off rates lead to different confusion matrix results. Please note that adding all these values leads to 9,695 which are the number of observations in the test set.
You may want to see relative and not absolute values. Let’s try the CrossTable() function instead.
CrossTable(test$loan_st, pred_cutoff_15, prop.r = TRUE,
prop.c = FALSE, prop.t = FALSE, prop.chisq = FALSE)
##
##
## Cell Contents
## |-------------------------|
## | N |
## | N / Row Total |
## |-------------------------|
##
##
## Total Observations in Table: 9695
##
##
## | pred_cutoff_15
## test$loan_st | 0 | 1 | Row Total |
## -------------|-----------|-----------|-----------|
## 0 | 6554 | 2081 | 8635 |
## | 0.759 | 0.241 | 0.891 |
## -------------|-----------|-----------|-----------|
## 1 | 308 | 752 | 1060 |
## | 0.291 | 0.709 | 0.109 |
## -------------|-----------|-----------|-----------|
## Column Total | 6862 | 2833 | 9695 |
## -------------|-----------|-----------|-----------|
##
##
This table is more informative. The logi_full model correctly predicts no-defaults in 75.9% of all the actual no-default cases. In other words, the model fails to predict no-default in 24.1% of the total no-default cases. Now the default. The model correctly predicts default in 70.9% of all the actual default cases (not bad), and fails to predict default in 29.1% of the total default cases.
Instead of arbitrarily consider a cutoff of 0.15, we can follow a different approach. Now consider that we are interested in taking the 20% highest estimates (closer to 1) of pred_logi_full as default. Equivalently, this is to take the lowest 80% pred_logi_full estimates (closer to 0) as no-default. Let’s calculate the new cut-off that meets this new criterion.
# Cutoff definition.
cutoff <- quantile(pred_logi_full, 0.8)
cutoff
## 80%
## 0.1994621
This new approach (taking the 20% highest estimates of pred_logi_full as default) represents a cut-off of 0.1994621. Graphically:
ggplot(pred_logi[pred_logi$model == "pred_logi_full",],
aes(x = pred, fill = model)) +
geom_density(alpha = 0.4) +
geom_vline(xintercept = cutoff, linetype = "longdash") +
labs(y = "Density", x = "Default prediction") +
theme(legend.position = "none", legend.title = element_blank())
Figure 1.10: Full model prediction histogram new cutoff of 0.1994621.
Now the cut-off is 0.1994621. This splits the loan status predictions into two parts: higher than the cut-off is a default, and lower than the cutoff is a no-default. Taking the lowest 80% estimates (closer to 0) as no-default is an arbitrary decision. Here are the cut-off values depending on this arbitrary decision.
cutoff_all <- quantile(pred_logi_full, seq(0.1, 1, 0.1))
data.frame("cut_off" = round(cutoff_all,5))
## cut_off
## 10% 0.00000
## 20% 0.01469
## 30% 0.02410
## 40% 0.03751
## 50% 0.06184
## 60% 0.09617
## 70% 0.14673
## 80% 0.19946
## 90% 0.29227
## 100% 0.85444
We can show a similar summary table as we did with the cutoff of 0.15. Here, we show the predictive ability of the logi_full model and new cut-off of 0.1994621.
# Calculate the predictions with the same model and new cutoff.
pred_full_20 <- ifelse(pred_logi_full > cutoff, 1, 0)
# Show results in a confusion matrix.
CrossTable(test$loan_st, pred_full_20, prop.r = TRUE,
prop.c = FALSE, prop.t = FALSE, prop.chisq = FALSE)
##
##
## Cell Contents
## |-------------------------|
## | N |
## | N / Row Total |
## |-------------------------|
##
##
## Total Observations in Table: 9695
##
##
## | pred_full_20
## test$loan_st | 0 | 1 | Row Total |
## -------------|-----------|-----------|-----------|
## 0 | 7309 | 1326 | 8635 |
## | 0.846 | 0.154 | 0.891 |
## -------------|-----------|-----------|-----------|
## 1 | 447 | 613 | 1060 |
## | 0.422 | 0.578 | 0.109 |
## -------------|-----------|-----------|-----------|
## Column Total | 7756 | 1939 | 9695 |
## -------------|-----------|-----------|-----------|
##
##
With a cutoff of 0.1994621 we accept 7,309 + 447 = 7,756 applications as those are the ones that the model predicts a no default. We can compare both confusion matrix:
cat <- c("correct no-default", "false default",
"false no-default", "correct default")
cut_15 <- c(0.759, 0.241, 0.291, 0.709)
cut_1994621 <- c(0.846, 0.154, 0.422, 0.578)
cbind("classification" = cat, cut_15, cut_1994621)
## classification cut_15 cut_1994621
## [1,] "correct no-default" "0.759" "0.846"
## [2,] "false default" "0.241" "0.154"
## [3,] "false no-default" "0.291" "0.422"
## [4,] "correct default" "0.709" "0.578"
The new cut-off of 0.1994621 improves the identification of no-defaults but worsen the identification of default. Also, the new cut-off fails less in the default and fails more in the no-default. Apparently, there is some sort of trade-off here.
We can also look the detail of 0.1994621 cut-off. Comparing two columns, the one in the left with the actual loan status, and the right column with the estimated loan status.
# Comparative table in detail.
real_pred_20 <- cbind.data.frame(test$loan_st, pred_full_20,
"Did the model succeed?" = test$loan_st==pred_full_20)
# Show some values.
real_pred_20[131:140,]
## test$loan_st pred_full_20 Did the model succeed?
## 398 0 0 TRUE
## 399 1 1 TRUE
## 404 0 1 FALSE
## 414 0 0 TRUE
## 417 1 1 TRUE
## 419 0 0 TRUE
## 420 0 0 TRUE
## 425 1 1 TRUE
## 431 0 0 TRUE
## 435 0 0 TRUE
In this sample the model fails in 1 out of 10 individuals. Not bad at all. Let’s imagine we are a bank. We have a total of 9,695 applications for a loan in our desk (or computer). Assume we use the predictions from pred_full_20 to decide whether we accept a loan application or not. Our model-based acceptance rule is the following: if pred_full_20 = 0 then the model estimates a no-default and we accept the loan application. According to the extract of table above, we fortunately reject application 399, 417 and 425 because that was indeed a default. However, we reject application 404 incorrectly because it did not default. In principle, having a model as a base for a decision rule can lead to better results that guessing or a random approval rule.
Let’s count how many applications are accepted and rejected according to our rule.
# Accepted.
accept_20 <- sum(pred_full_20 == 0)
# Rejected.
reject_20 <- sum(pred_full_20 == 1)
data.frame("total" = length(pred_full_20), accept_20, reject_20)
## total accept_20 reject_20
## 1 9695 7756 1939
Taking the 20% highest estimates of pred_logi_full as default and the lowest 80% pred_logi_full as no-default mean that by construction, we accept 7,756 loan applications (80% of the total) and reject 1,939 (20% of the total). Then, the criterion determines the number of accepted applications, and the model determines which applications to accept/reject.
We can evaluate our loan accept/reject rule as we have the corresponding real values in loan_st. First, let’s illustrate the decision making process given the model estimates.
# First 10 accept decisions.
head(data.frame(real_pred_20[,1:2],
decision = ifelse(real_pred_20$pred_full_20 == 0,
"accept", "reject")), 12)
## test.loan_st pred_full_20 decision
## 1 0 0 accept
## 2 0 0 accept
## 18 1 0 accept
## 26 0 1 reject
## 27 0 0 accept
## 28 0 0 accept
## 30 0 0 accept
## 32 0 0 accept
## 34 0 1 reject
## 35 0 0 accept
## 36 0 0 accept
## 37 1 0 accept
We can add an evaluation column.
# First 10 accept decisions.
head(data.frame(real_pred_20[,1:2],
decision = ifelse(real_pred_20$pred_full_20 == 0,
"accept", "reject"),
evaluation = ifelse(real_pred_20$pred_full_20 == 1,
"we rejected a good customer",
ifelse(real_pred_20$pred_full_20 == 0 & real_pred_20$`test$loan_st` == 0,
"good decision", "bad decision"))), 12)
## test.loan_st pred_full_20 decision evaluation
## 1 0 0 accept good decision
## 2 0 0 accept good decision
## 18 1 0 accept bad decision
## 26 0 1 reject we rejected a good customer
## 27 0 0 accept good decision
## 28 0 0 accept good decision
## 30 0 0 accept good decision
## 32 0 0 accept good decision
## 34 0 1 reject we rejected a good customer
## 35 0 0 accept good decision
## 36 0 0 accept good decision
## 37 1 0 accept bad decision
In the table above, we have the first 12 loan applications. According to our rule, we accept 10 applications and we reject 2 (application #26 and #34). A good decision is because we accept the loan that did not default. A bad decision is because we accept the loan application and defaulted. We also have some cases in which we rejected a good customer and that is not good. This table above is interesting although a bit problematic as it incorporates a counterfactual approach. In particular, we are evaluating the cases in which we reject the application. A more pragmatic approach is to evaluate our rule according to the cases in which we actually accept the loan application. This is the basically a bad rate measure: how many accepted loan applications default?
Remember we accepted 10 and rejected 2 loans. We can identify who default.
# We accept loans that the model predicts a no-default (0).
# In "accepted_loans" we know whether the accepted loans are in fact
# default or no-default.
accepted_loans <- real_pred_20[pred_full_20 == 0, 1]
# The code above says: if we accept the application, tell me what happened.
head(accepted_loans, 10)
## [1] 0 0 1 0 0 0 0 0 0 1
## Levels: 0 1
Note that the third and the tenth application default. These are applications #18 and #37 as expected. Now let’s evaluate not 12 applications but all of them, which are 7,756. The bad rate is now expressed as a percentage of the total:
# bad_rate is the proportion of accepted loans that are in fact default.
bad_rate <- sum(accepted_loans == 1)/length(accepted_loans)
bad_rate
## [1] 0.0576328
This is, by following the model-based rule, we accepted 7,756 loan applications that represent 80% of the total applications. However, 5.76% of those accepted applications were in fact a default. In particular, we accepted 447 loans that are defaults so: \(447/7756=0.0576328\).
Models are not perfect but we are always interested to find out a good model that leads to a lower bad rate because (in principle) we do not want to accept many defaults. If we keep the same model, we could reduce this 5.76% by being more strict in the loan application which in simple terms mean to reduce the acceptance rate. This alternative could be controversial as a lower acceptance rate represents lower income (less customers) for the bank or financial firm. In any case, consider we reduce the acceptance rate from 80% to 65% so we can evaluate the impact over the bad rate.
# New cutoff value.
cutoff <- quantile(pred_logi_full, 0.65)
# Split the pred_logi_full into a binary variable.
pred_full_35 <- ifelse(pred_logi_full > cutoff, 1, 0)
# A data frame with real and predicted loan status.
real_pred_35 <- cbind.data.frame(test$loan_st, pred_full_35)
# Loans that we accept given these new rules.
accepted_loans <- real_pred_35[pred_full_35 == 0, 1]
# Bad rate (accepted loan applications that are defaults).
bad_rate <- sum(accepted_loans == 1)/length(accepted_loans)
# Show the bad rate.
bad_rate
## [1] 0.03713107
As expected, the bad rate is lower (from 5.76%% to 3.71%). This is, the lower the acceptance rate the lower the bad rate. In the extreme, if we accept 0 loan applications then our bad rate would be zero, but doing so is equivalent as going out of business. We can create a function such that given a vector of prediction of loan status we can return the bad rate for different cutoff values. This could be useful to build the bank strategy. This function will reveal the trade-off between the acceptance rate and the bad rate. In particular, the lower the acceptance rate, the lower the income (bad thing) and the lower the bad rate (good thing). So, which combination is the optimal?
A bank could be interested to understand the relationship between the acceptance rate and the bad rate given a model that predicts the loan status.
# Function.
bank <- function(prob_of_def){
cutoff <- rep(NA, 21)
bad_rate <- rep(NA, 21)
accept_rate <- seq(1, 0, by = -0.05)
for (i in 1:21){
cutoff[i] <- quantile(prob_of_def, accept_rate[i])
pred_i <- ifelse(prob_of_def > cutoff[i], 1, 0)
pred_as_good <- test$loan_st[pred_i == 0]
bad_rate[i] <- sum(pred_as_good == 1)/length(pred_as_good)}
table <- cbind(accept_rate, cutoff = round(cutoff, 4),
bad_rate = round(bad_rate, 4))
return(list(table = table, bad_rate = bad_rate,
accept_rate = accept_rate, cutoff = cutoff))
}
We can evaluate this function for the logi_full, and a bad model like the logi_age. In principle, we expect the logi_full model to have a more attractive relationship between the acceptance rate and the bad rate. This is, lower bad rates for a given acceptance rate. Any financial institution could be interested in increasing the acceptance rate without increasing too much the bad rate.
Let’s apply the function to the pred_logi_full and the logi_age.
# Apply the bank function.
bank_logi_full <- bank(pred_logi_full)
bank_logi_age <- bank(pred_logi_age)
data.frame(accept_rate = bank_logi_age$accept_rate,
"Bad_model_bad_rate" = bank_logi_age$bad_rate,
"Good_model_bad_rate)" = bank_logi_full$bad_rate)
## accept_rate Bad_model_bad_rate Good_model_bad_rate.
## 1 1.00 0.10933471 0.109334709
## 2 0.95 0.10849715 0.090662324
## 3 0.90 0.10849715 0.076790831
## 4 0.85 0.10849715 0.066626214
## 5 0.80 0.10547397 0.057632800
## 6 0.75 0.10547397 0.050612020
## 7 0.70 0.10396251 0.043619216
## 8 0.65 0.10396251 0.037131070
## 9 0.60 0.10092961 0.029396596
## 10 0.55 0.10092961 0.023443361
## 11 0.50 0.09933775 0.021039604
## 12 0.45 0.10206865 0.018565207
## 13 0.40 0.10206865 0.015471893
## 14 0.35 0.09907039 0.011788977
## 15 0.30 0.09972041 0.009281540
## 16 0.25 0.10343554 0.005363036
## 17 0.20 0.09722922 0.003610108
## 18 0.15 0.09360190 0.001374570
## 19 0.10 0.10125362 0.000000000
## 20 0.05 0.10516605 0.000000000
## 21 0.00 0.00000000 0.000000000
The full model is superior because at any acceptance rate we can reach a lower bad rate. A plot can reveal the main differences of these two models: logi_full and logi_age.
# Plot the strategy functions
par(mfrow = c(1, 2))
plot(bank_logi_full$accept_rate, bank_logi_full$bad_rate,
type = "l", xlab = "Acceptance rate", ylab = "Bad rate",
lwd = 2, main = "logi_full")
abline(v = bank_logi_full[["accept_rate"]][8], lty = 2)
abline(h = bank_logi_full[["bad_rate"]][8], lty = 2)
abline(v = bank_logi_full[["accept_rate"]][5], lty = 2, col = "red")
abline(h = bank_logi_full[["bad_rate"]][5], lty = 2, col = "red")
plot(bank_logi_age$accept_rate, bank_logi_age$bad_rate,
type = "l", xlab = "Acceptance rate",
ylab = "Bad rate", lwd = 2, main = "logi_age")
abline(v = bank_logi_age[["accept_rate"]][8], lty = 2)
abline(h = bank_logi_age[["bad_rate"]][8], lty = 2)
abline(v = bank_logi_age[["accept_rate"]][5], lty = 2, col = "red")
abline(h = bank_logi_age[["bad_rate"]][5], lty = 2, col = "red")
Figure 1.11: A good and a bad model.
The logi_full model is better because for any acceptance rate we can reach a lower bad rate compared with the logi_age. This is because the logi_full model can identify defaults and no defaults with higher precision compared with the logi_age. The value of a good model is that it can help us to make better business decisions, in this case better credit evaluation decisions.
The model ability to predict defaults and no defaults can be measured by the AUC. The AUC can be defined as the probability that the fit model will score a randomly drawn positive sample higher than a randomly drawn negative sample. AUC stands for area under the curve in the following context:
ROC_logi_full <- roc(test$loan_st, pred_logi_full)
ROC_logi_age <- roc(test$loan_st, pred_logi_age)
# Draw the ROCs on one plot
plot(ROC_logi_full, col = "red")
lines(ROC_logi_age, col = "blue")
Figure 1.12: Full model in red, age model in blue.
Sensitivity is the model ability to correctly identify defaults, these are known as true positive. Specificity is the model ability to correctly identify no-default loans, these are known as true negative.
As expected, the area under the curve (AUC) is higher for the red line which corresponds to the logi_full model. We can calculate the exact values:
# Compute the AUCs:
auc(ROC_logi_full)
## Area under the curve: 0.8213
auc(ROC_logi_age)
## Area under the curve: 0.5301
Note that the logi_age has an AUC of 0.5301. This is close to a loan approval process in which we randomly accept and reject with no further analysis. In other words, the logi_age is so bad that it is almost equivalent as using no model at all and accept and reject loan applications based on a random rule. A pure-random approval rule would look like this:
set.seed(2020)
pred_rand_model <- runif(length(pred_logi_age))
ROC_rand <- roc(test$loan_st, pred_rand_model)
# Draw the ROCs on one plot
plot(ROC_rand, col = "orange")
Figure 1.13: Random rule model.
auc(ROC_rand)
## Area under the curve: 0.5105
In theory, this random evaluation process would lead to an AUC of \(0.5 = (1 \times 1)/2\). In contrast, now imagine we have a perfect model:
pred_perfect_model <- as.numeric(test$loan_st)
ROC_perfect <- roc(test$loan_st, pred_perfect_model)
plot(ROC_perfect)
Figure 1.14: Perfect model.
auc(ROC_perfect)
## Area under the curve: 1
In a perfect model, the AUC is equal to 1 (\(1 \times 1\)). The model correctly identify defaults (100% sensitivity) and at the same time the model correctly identify no-defaults (100% specificity).
library(knitr)
library(plotly)
library(Sim.DiffProc)
library(scatterplot3d)
library(vembedr)
The Merton model is useful when we are interested in evaluating the credit risk of public firms. In short, it evaluates how likely it is that the value of the future firm’s assets fall below a certain threshold represented by the firm’s debt. By doing that, the model is able to estimate a probability of default among other results. The model can be used as a tool to propose changes (for example) in the balance sheet to reduce the credit risk exposure of a firm.
The philosophy of these models goes back to Black and Scholes (1973), Merton (1973) and Merton (1974), they consider corporate liabilities as contingent claims on the assets of the firm.
The estimation of the credit risk Merton’s model requires a minimization of a function. In this section, we show a simple minimization example. Here are some useful online resources for further references.
The objective function to be minimized is \(f(a,b)=\sum(a + bx_i-y_i)^2\). We have six values for \(x\) and \(y\), and we are expected to minimize the function \(f\) by finding the parameter values \(a\) and \(b\). The \(x_i\) and \(y_i\) data looks like this:
# Data.
dat <- data.frame(x = c(1, 2, 3, 4, 5, 6), y = c(1, 3, 5, 6, 8, 12))
kable(dat, caption = "Observations.", row.names = TRUE) # The table.
| x | y | |
|---|---|---|
| 1 | 1 | 1 |
| 2 | 2 | 3 |
| 3 | 3 | 5 |
| 4 | 4 | 6 |
| 5 | 5 | 8 |
| 6 | 6 | 12 |
Graphically:
# A scatter plot.
plot(y ~ x, data = dat, pch = 19, cex = 2)
Figure 2.1: Original data.
Let’s implement the \(f(a,b)=\sum(a + bx_i-y_i)^2\) minimization.
# Function that calculates the residual sum of squares.
min.RSS <- function(data, par) {
with(data, sum((par[1] + par[2] * x - y)^2)) }
# Function minimization.
result <- optim(par = c(0, 1), min.RSS, data = dat)
a <- result$par[1]
b <- result$par[2]
f <- sum((a + b * dat$x - dat$y)^2)
# Minimization output.
ans <- data.frame(a, b, f)
kable(ans, caption = "Minimization results f(a,b).")
| a | b | f |
|---|---|---|
| -1.266846 | 2.02862 | 2.819048 |
The minimization result is represented by a line, and the line is characterized by an intercept equal to \(-1.266846\) and a slope equal to \(2.02862\).
We can plot it and verify that this is indeed the right answer.
# View the results.
plot(y ~ x, data = dat, pch = 19, ylim = c(-2, 12), xlim = c(0, 6), cex = 2)
abline(a, b, col = "red", lwd = 3)
abline(h = 0, lty = 2)
abline(v = 0, lty = 2)
Figure 2.2: Line that minimize the residual sum of squares.
The function that we minimize here is basically the OLS criterion. This is why the regression model leads to the same results.
# The previous example is equivalent as a linear regression model.
reg <- lm(y ~ x, data = dat)
# See the results.
summary(reg)
##
## Call:
## lm(formula = y ~ x, data = dat)
##
## Residuals:
## 1 2 3 4 5 6
## 0.2381 0.2095 0.1810 -0.8476 -0.8762 1.0952
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.2667 0.7815 -1.621 0.180388
## x 2.0286 0.2007 10.109 0.000539 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.8395 on 4 degrees of freedom
## Multiple R-squared: 0.9623, Adjusted R-squared: 0.9529
## F-statistic: 102.2 on 1 and 4 DF, p-value: 0.000539
The function that we minimize in the estimation of the credit risk Merton model is more elaborated, but we follow the same principle as above.
For an introductory reference to understand the basics of credit see Brealey et al. (2020).
Let’s follow the example 24.3 in (Hull 2015). We have five parameters: \(E_0=3\), \(\sigma_E=0.8\), \(rf=0.05\), \(T=1\), \(D=10\) that lead to an estimate of the value of the assets today \(V_0\) and the volatility of the assets \(\sigma_V\). With these seven parameters we can estimate the probability of default \(pd\) among other interesting results.
These are the known parameters.
# List of known parameters.
E0 <- 3 # Value of the equity as today, this is market capitalization.
se <- 0.8 # Stock return volatility.
rf <- 0.05 # Risk-free rate.
TT <- 1 # Maturity.
D <- 10 # Value of the debt. The Bloomberg function DDIS is useful.
We need equations 24.3 and 24.4 to estimate \(V_0\) and \(\sigma_V\).
eq24.3 <- function(V0, sv) {
((V0 * pnorm((log(V0 / D) + (rf + sv^2 / 2) * TT) / (sv * sqrt(TT))) -
D * exp(-rf * TT) * pnorm(((log(V0 / D) +
(rf + sv^2 / 2) * TT) / (sv * sqrt(TT))) - sv * sqrt(TT)) - E0)) }
eq24.4 <- function(V0, sv) {
((pnorm((log(V0 / D) + (rf + sv^2 / 2) * TT) / (sv * sqrt(TT))) *
sv * V0 - se * E0)) }
# Footnote 10 indicates that we should minimize F(x,y)^2+G(x,y)^2
min.footnote10 <- function(x)
(eq24.3(x[1], x[2]))^2 + (eq24.4(x[1], x[2]))^2
# The minimization leads to the values of V0 and sv.
V0_sv <- optim(c(1, 1), min.footnote10)
# Define the values as parameters.
V0 <- V0_sv$par[1]
sv <- V0_sv$par[2]
kable(data.frame("Market value" = V0,
"Volatility" = sv),
caption = "Assets.")
| Market.value | Volatility |
|---|---|
| 12.39578 | 0.2123041 |
Calculate the probability of default \(N(-d_2)\) as a function of \(V_0\) and \(\sigma_V\).
PD <- function(V0, sv) {
pnorm(-(((log(V0 / D) + (rf + sv^2 / 2) * TT) /
(sv * sqrt(TT))) - sv * sqrt(TT))) }
# Calculate the probability of default given the previous parameters.
pd <- PD(V0, sv)
pd
## [1] 0.1269396
The probability of default is 12.69396%.
Let’s analyze the role of the values of \(d_1\), \(d_2\), \(N(d_1)\) and \(N(d_2)\). First, we isolate these values.
# Inspect the role of d and N(d).
d1 <- (log(V0 / D) + (rf + sv^2 / 2) * TT) / (sv * sqrt(TT))
d2 <- d1 - sv * sqrt(TT)
Nd1 <- pnorm(d1)
Nd2 <- pnorm(d2)
ans <- data.frame(d1, Nd1, d2, Nd2)
ans
## d1 Nd1 d2 Nd2
## 1 1.353282 0.9120172 1.140978 0.8730604
Now, let’s illustrate how \(N(-d_2)\) lead to the probability of default given the properties of a standard normal distribution.
xseq <- seq(-4, 4, 0.01)
densities <- dnorm(xseq, 0, 1) # Standard normal distribution.
# Graphical evaluation of N(-d_2).
plot(xseq, densities, xlab = "d values (-d_2 is the dotted line)",
ylab = "Density", type = "l", lwd = 2, cex = 2)
abline(h = 0)
polygon(c(xseq[xseq >= -d2], -d2), c(densities[xseq >= -d2], 0), col = "green")
polygon(c(xseq[xseq < -d2], -d2), c(densities[xseq < -d2], 0), col = "red")
legend("topleft", legend = c("N(d_2)=0.8730604
in green and
N(-d_2)=0.1269396 in red.
N(d_2)+N(-d_2)=1."),
bg = "white", bty = "n", cex = 0.9)
abline(v = -d2, lty = 2, lwd = 3)
Figure 2.3: The red area represents the probability of default.
Now, let’s analyze the minimization that leads to \(V_0\) and \(\sigma_V\). We do not have an analytic math expressions (closed form) to calculate \(V_0\) and \(\sigma_V\), instead we have approximate values or estimates given the minimization of equations 24.3 and 24.4 in (Hull 2015). In order to see how good this approximation is, we can retrieve the value of \([F(x,y)]^2+[G(x,y)]^2\) evaluated at \(x=V_0\) and \(y=\sigma_v\). This value is supposed to be positive as both \(F(x,y)\) and \(G(x,y)\) functions are squared, so we expect a positive number close to zero when evaluated at \(x=V_0\) and \(y=\sigma_v\).
# Evaluate how good is the minimization.
min.footnote10(x = c(V0, sv))
## [1] 1.425797e-07
round(min.footnote10(x = c(V0, sv)), 6)
## [1] 0
The minimization conducted in the function min.footnote10() above worked fairly well as the value is zero in practical terms.
There is another way to approach how the minimization work. We can propose a graphical analysis. In particular, we can verify whether \(V_0=12.39578\) and \(\sigma_V=0.2123041\) minimizes the objective function. To illustrate this in two axis, we need to fix either \(V_0\) or \(\sigma_V\), and evaluate \([F(x,y)]^2+[G(x,y)]^2\) with different values of \(V_0\) or \(\sigma_V\). Let’s do that.
# Fix sv and evaluate different values of V0.
V0.seq <- seq(from = 4, to = 20, length.out = 100)
sv.rep <- rep(sv, 100)
# Function to be evaluated.
FG <- function(V0, sv) { (eq24.3(V0, sv))^2 + (eq24.4(V0, sv))^2 }
# Apply the function with fixed sv and different V0 values.
FG.V0 <- mapply(FG, V0.seq, sv.rep)
We are ready to plot.
# Plot the results.
plot(V0.seq, FG.V0, type = "l", xlab = expression(paste(V[0])),
ylab = expression(paste(F(x,y)^2+G(x,y)^2)), cex.lab = 0.8, lwd = 3)
abline(v = V0, lty = 2)
abline(h = FG(V0, sv), lty = 2)
points(V0, FG(V0, sv), pch = 1, cex = 3, col = "red", lwd = 2)
Figure 2.4: Here we fix the volatility of the assets and change the value of the assets at time zero. The minimization looks fine.
Clearly, the minimization worked well here. For the sake of completeness, now we fix the \(V_0\) value and evaluate the function again.
# Now fix V0, and evaluate different sv values.
sv.seq <- seq(0, 0.6, length.out = 100)
V0.rep <- rep(V0, 100)
# Evaluate the function with these parameters.
FG.sv <- mapply(FG, V0.rep, sv.seq)
# Plot the results.
plot(sv.seq, FG.sv, type = "l", xlab = expression(paste(sigma[V])),
ylab = expression(paste(F(x, y)^2 + G(x, y)^2)),
cex.lab = 0.8, lwd = 3)
abline(v = sv, lty = 2)
abline(h = FG(V0, sv), lty = 2)
points(sv, FG(V0, sv), pch = 1, cex = 3, col = "red", lwd = 2)
Figure 2.5: Here we fix the value of the assets at time zero and change the volatility of the assets. The minimization looks fine.
We can show the same as before but in a 3D plot.
# outer evaluates the function for each combination of the vectors.
z <- outer(V0.seq, sv.seq, FG) # z is now 100x100 matrix.
persp(V0.seq, sv.seq, z, col = "springgreen", shade = 0, xlab = "V0",
ylab = "sv", zlab = "Objective function.")
Figure 2.6: Minimization of the function.
Contour plots or level plots are a way to show a three-dimensional surface on a two-dimensional plane.
# The nlevels is high to emphasize the minimum value in blue.
contour(V0.seq, sv.seq, z, xlab = expression(paste(V[0])),
ylab = expression(paste(sigma[V])),
lwd = 2, nlevels = 300)
points(V0, sv, pch = 19, col = "blue", cex = 2)
Figure 2.7: Minimization of the function: a contour view.
The following plot cannot be seen in a PDF output, but it can be seen in a browser as an interactive plot. This green plot is nice but it is not interactive. We could plot interactive three-dimension plots with other R packages like plotly and show them in a html context or java environment.
plot_ly(type = "surface" , x = sv.seq, y = V0.seq , z = z ) %>%
layout(#title = "Minimization of f at V0=12.395 an sv=0.212",
scene = list(xaxis = list(title = "Assets volatility"),
yaxis = list(title = "Assets",
zaxis = list(title = "f=F^2+G^2")))) %>%
hide_colorbar()
Figure 2.8: Minimization of the function: an interactive view.
#add_markers(y = 12.3957765, x = 0.2123041, z = 0, inherit = TRUE)
The Merton model allows us to calculate more values in this credit risk assessment.
# Other results in the Merton's model.
market_value_debt <- V0 - E0
pv_promised_payment_debt <- D * exp(-rf * TT)
Expected_Loss <- (pv_promised_payment_debt - market_value_debt) /
pv_promised_payment_debt
recovery_rate <- 1 - (Expected_Loss / pd)
ans <- data.frame(market_value_debt, pv_promised_payment_debt,
Expected_Loss, recovery_rate, pd)
kable(t(ans), caption = "Other results in the Merton's model.")
| market_value_debt | 9.3957765 |
| pv_promised_payment_debt | 9.5122942 |
| Expected_Loss | 0.0122492 |
| recovery_rate | 0.9035039 |
| pd | 0.1269396 |
Could we figure out the implied bond yield spread? It would be similar to the fair interest rate that this firm has to pay given its probability of default and recovery rate. Take Hull’s equation 24.2 as a reference: \(\lambda(T)=S(T)/(1-R)\) and solve for \(S(T)\): \(S(T)=\lambda(T)\times(1-R)\), so \(S(T)=(0.12693963)\times(1-0.90350393)\). This is 122.4918 basis points more than the risk-free rate. Then, with five parameters we can calculate (among other things) the firm’s probability of default and the correspondent yield spread of this firm’s bond. It would be interesting to compare this value with the market yield spread and evaluate whether the market yield is over or underestimated according to our theoretical value.
This is the spread function.
spread <- function(E0, se, rf, TT, D) {
eq24.3 <- function(V0, sv) {
((V0 * pnorm((log(V0 / D) + (rf + sv^2 / 2) * TT) / (sv * sqrt(TT))) -
D * exp(-rf * TT) * pnorm(((log(V0 / D) +
(rf + sv^2 / 2) * TT) / (sv * sqrt(TT))) - sv * sqrt(TT)) - E0)) }
eq24.4 <- function(V0, sv) {
((pnorm((log(V0 / D) + (rf + sv^2 / 2) * TT) / (sv * sqrt(TT))) *
sv * V0 - se * E0)) }
min.footnote10 <- function(x)
(eq24.3(x[1], x[2]))^2 + (eq24.4(x[1], x[2]))^2
V0_sv <- optim(c(1, 1), min.footnote10)
V0 <- V0_sv$par[1]
sv <- V0_sv$par[2]
pd <- pnorm(-(((log(V0 / D) + (rf + sv^2 / 2) * TT) /
(sv * sqrt(TT))) - sv * sqrt(TT)))
market_value_debt <- V0 - E0
pv_promised_payment_debt <- D * exp(-rf * TT)
Expected_Loss <- (pv_promised_payment_debt - market_value_debt) /
pv_promised_payment_debt
recovery_rate <- 1 - (Expected_Loss / pd)
spread <- pd * (1-recovery_rate)
spread
}
Let’s evaluate the spread function.
spread(E0 = 3, se = 0.8, rf = 0.05, TT = 1, D = 10)
## [1] 0.01224917
The model assumes that the assets follow a geometric Brownian motion stochastic process. A regular Wiener stochastic process looks like this.
embed_url("https://youtu.be/X-VRWeqG_I8")
Let’s go back to our main example and assume the daily evolution of the market value of the firm assets over a year. Here are 10 simulated paths of \(V_t\) from \(t=0,...,T\). The reference of this section is (Hull 2015) Chapter 14 and 15.
# Merton assumes V follows a geometric Brownian motion process.
V0.sim <- function(i) { # the argument i is not used here.
GBM(N = 365, T = 1, t0 = 0, x0 = V0, theta = 0.05, sigma = sv) }
set.seed(3) # Reproducibility
paths <- sapply(1:10, V0.sim) # Create 10 paths for V.
# Plot results.
matplot(paths, type = "l", col = "black", lwd = 1, lty = 1,
ylab = expression(paste(V[t])),
xlab = "Time in days (from 0 to T)")
abline(h = D, col = "red", lwd = 2)
points(1, V0, pch = 19, cex = 1.5, col = "blue")
points(rep(366, 8), paths[366, paths[366, ] > 10],
pch = 1, cex = 1.5, col = "green")
points(rep(366, 2), paths[366, paths[366, ] < 10],
pch = 1, cex = 1.5, col = "red")
Figure 2.9: Simulation of 10 paths of the assets.
These simulations can be interpreted as the evolution of the value of the firm’s assets in 10 parallel universes. In the past, some students have called this figure la gráfica de los pelitos, this is OK as long as you understand the figure implications. The right name is 10 simulated geometric Brownian motion processes. Let’s count the cases in which \(V_t<D\) note that here I use \(t\) and not \(T\), so we are counting the cases in which at least at some time \(t\) the value of the assets were lower than 10.
# Which path went lower than D?
which(colSums(paths < 10) > 0)
## [1] 2 4 9
The value of the assets was lower than $10 at some time in the second, fourth and ninth alternate universes. Let’s plot these cases.
# There might be an easier way to select these cases.
matplot(paths[,which(colSums(paths < 10) > 0)], type = "l",
col = c("green", "blue", "red"), lwd = 2, lty = 1,
ylab = expression(paste(V[t])),
xlab = "Time in days (from 0 to T)")
abline(h = D, col = "red", lwd = 2)
points(1, V0, pch = 19, cex = 1.5, col = "blue")
Figure 2.10: Note that the blue path finally did not default. Recovery is possible as in the real life.
The blue path was lower than 10 at some time, but at \(t=T\) it is clearly higher than 10. For this reason, the blue path does not represent a firm’s default. See how the assets are not high enough to pay the debt at maturity only in two cases: green and red. Thus, according to this simulation the probability of default is 20%. Here is how we can calculate the cases in which \(V_T<D\).
sum(paths[366,] < 10) / 10
## [1] 0.2
The probability of default of 20% can be disappointing because it is significantly higher than 12.693963%. This is because the number of simulations is small. The following examples show how these values tend to converge as we increase the number of simulations from 10 to 100 and to 100,000.
Here are 100 simulated paths of \(V_0\) to \(V_T\).
set.seed(3) # Reproducibility
paths <- sapply(1:100, V0.sim) # Create 100 paths.
# Plot results.
matplot(paths, type = "l", col = "black", lwd = 1, lty = 1,
ylab = expression(paste(V[t])),
xlab = "Time in days (from 0 to T)")
abline(h = D, col = "red", lwd = 2)
points(1, V0, pch = 19, cex = 1.5, col = "blue")
Figure 2.11: Simulation of 100 paths of the assets.
Given the simulation above, the probability of default is 17%. This is closer to 12.693963%.
sum(paths[366,] < 10) / 100
## [1] 0.17
Let’s see all the 17 cases that end in default.
# These are the 17 default cases.
matplot(paths[,which(paths[366,] < 10)], type = "l", col = "black",
lwd = 1, lty = 1, ylab = expression(paste(V[t])),
xlab = "Time in days (from 0 to T)")
abline(h = D, col = "red", lwd = 2)
points(1, V0, pch = 19, cex = 1.5, col = "blue")
Figure 2.12: Default cases.
Now, let’s see the 11 cases in which \(V_t\) eventually went lower than $10 but at the end did not default.
# V_t < 10
almost_default <- which(colSums(paths < 10) > 0)
# V_T < 10
default <- which((paths[366,] < 10))
# There should be an easier way to do this:
matplot(paths[,almost_default[which(almost_default %in%
default == FALSE)]],
type = "l", col = c(1:11), lwd = 1, lty = 1,
ylab = expression(paste(V[t])),
xlab = "Time in days (from 0 to T)")
abline(h = D, col = "red", lwd = 2)
points(1, V0, pch = 19, cex = 1.5, col = "blue")
Figure 2.13: All these paths went lower than 10 at some point, but at the end the value of the assets are higher than 10. These are paralell universes in which the firm finally survived after some drama.
Here are 100,000 simulated \(V_T\) showed in a histogram. The reference for this section is 15.1 ((Hull 2015)). Note that instead of drawing the complete paths, we can directly get the distribution of \(V_T\). The resulting distribution is log-normal.
set.seed(13)
# 100,000 values of VT at once.
VT <- exp(rnorm(100000, log(V0) + (rf - (sv^2) / 2)*TT, sv*sqrt(TT)))
# Plot results.
h <- hist(VT, 100, plot = FALSE)
ccat <- cut(h$breaks, c(-Inf, D, Inf))
# You may also need to change the xlim if a big red square appears as changes in maturity may imply larger values of VT. I recommend trying without xlim and see.
plot(h, main = NULL, col = c("red", "blue")[ccat],
xlab = "Value of the assets at maturity", xlim = c(4, 30))
legend("topright",
legend = c("The red area is 12.657%, and represents
the simulated probability of default.
The probability of default of the model
following the textbook example is
N(-d_2)=12.69396%. As we can see,
both are very close."),
col = c("red", "blue"), pch = c(19), bg = "white", bty = "n", cex = 0.8)
Figure 2.14: Histogram of 100,000 values of the assets at maturity.
The probability of default can be calculated as:
sum(VT < 10) / 100000
## [1] 0.12657
Note that 0.12657 is now much closer than 0.12693963. Convergence achieved.
This firm can be represented as an European call option payoff. The payoff of a typical stock option is \(c=max(S_T-K, 0)\) or equivalently in terms of Merton’s model: \(E_T=max(V_T-D, 0)\).
ET <- pmax(VT - D, 0) # payoff function of a typical call option.
plot(sort(VT), sort(ET), type = "l", lwd = 4, xlim = c(5, 20),
ylim = c(0, 10), xlab = "Simulated assets at maturity (V_T)",
ylab = "Simulated equity at maturity (E_T)")
abline(v = 10, lty = 2)
legend("left", legend = c("According to the model,
E_T=0 happens
in 12.657% of the
100,000 cases."),
bg = "white", bty = "n", cex = 0.9)
Figure 2.15: A view of the firm’s balance sheet in the future. Looks like a typical European call option payoff.
This diagram show the firm’s market values in the future at maturity. This is because the \(E_T\) is a function of \(V_T\) and \(D\). As long as \(V_T<D\), then the value of \(E_T=0\). Otherwise, the value of \(E_T= V_T-D\), just as the accounting equation suggest but in future terms. Note that the Merton’s model says nothing about the estimated value of \(E_T\), instead we have an estimate about how likely is that \(V_T<D\), or in other words how likely is that \(E_T<0\).
Then, we know that \(E_T= max(V_T-D, 0)\). The Black-Scholes-Merton formula can also estimate the theoretical value of \(E_0\). This is, if the observable market value given by the market capitalization available in any financial site is \(E_0=3\), we can retrieve the theoretical value of \(E_0\) so we can compare whether the firm observable market value is over or undervalued. In other words, the Black-Scholes-Merton model can give us an estimate of the firm value. In particular, the model gives the value of the equity today as:
\(E_0=V_0N(d_1)-De^{-rT}N(d_2)\)
This is equation 24.3 in (Hull 2015). To estimate \(E_0\), we need \(V_0\) and \(\sigma_V\) and these two values were already estimated before with the implementation of the min.footnote10 function above. It is true that this minimization requires \(E_0\), the point here is to use the observable market capitalization \(E_0=3\) to find out the unobservable \(V_0\) and \(\sigma_V\) and then estimate the theoretical value of \(E_0\).
The code is:
# Black-Scholes call.
E0.theo <- function(V0, D, rf, TT, sv) {
d1 <- (log(V0 / D) + (rf + sv^2 / 2) * TT) / (sv * sqrt(TT))
d2 <- d1 - sv * sqrt(TT)
c <- V0 * pnorm(d1) - D * exp(-rf * TT) * pnorm(d2)
}
V0 <- 12.39578 # Assets.
D <- 10 # Debt.
rf <- 0.05 # Risk-free rate.
TT <- 1 # Maturity.
sv <- 0.2123041 # Volatility of assets.
(E0.theo(V0, D, rf, TT, sv))
## [1] 3.000357
Then, we can argue that the market value of the firm is slightly undervalued since the theoretical value of the firm is 3.000357 compared with 3. This approach opens the possibility to value firms. It also opens the possibility to compare the book value in the Balance sheet of the total assets versus the estimate \(V_0\). This will allow us to understand the difference between book value and market value of the assets, not only of the equity. We can even implement a similar analysis about the market value of the debt just as we did before.
We can expand our analysis by evaluating how changes in parameters change the probability of default in the context of the Merton model. This is interesting because we can move from estimating a probability of default to propose changes in the firm’s parameters like the capital structure to reduce a firm’s probability of default. First, we need the probability of default as a function.
pd <- function(E0, se, rf, TT, D) {
eq24.3 <- function(V0, sv) {
((V0 * pnorm((log(V0 / D) + (rf + sv^2 / 2) * TT) / (sv * sqrt(TT))) -
D * exp(-rf * TT) * pnorm(((log(V0 / D) +
(rf + sv^2 / 2) * TT) / (sv * sqrt(TT))) - sv * sqrt(TT)) - E0)) }
eq24.4 <- function(V0, sv) {
((pnorm((log(V0 / D) + (rf + sv^2 / 2) * TT) / (sv * sqrt(TT))) *
sv * V0 - se * E0)) }
min.footnote10 <- function(x)
(eq24.3(x[1], x[2]))^2 + (eq24.4(x[1], x[2]))^2
V0_sv <- optim(c(1, 1), min.footnote10)
V0 <- V0_sv$par[1]
sv <- V0_sv$par[2]
pd <- pnorm(-(((log(V0 / D) + (rf + sv^2 / 2) * TT) /
(sv * sqrt(TT))) - sv * sqrt(TT)))
pd }
Now we have a convenient function: \(pd = f(E_0, \sigma_E, rf, T, D)\). Remember these five parameters are required to estimate \(V_0\) and \(\sigma_V\) before calculating \(pd\).
# Evaluate the case of the textbook example.
pd1 <- pd(3, 0.8, 0.05, 1, 10)
pd1
## [1] 0.1269396
Now, we create vectors of parameters and evaluate the pd function.
l = 50 # Vectors of length 50.
# Create vectors of the parameters.
x.E <- seq(from = 1, to = 20, length.out = l)
x.rf <- seq(0, 4, length.out = l)
x.D <- seq(1, 20, length.out = l)
x.T <- seq(0.5, 20, length.out = l)
x.se <- seq(0, 3, length.out = l)
# Evaluate the pd at different values.
pd.E <- mapply(pd, x.E, se, rf, TT, D)
pd.rf <- mapply(pd, E0, se, x.rf, TT, D)
pd.D <- mapply(pd, E0, se, rf, TT, x.D)
pd.T <- mapply(pd, E0, se, rf, x.T, D)
pd.se <- mapply(pd, E0, x.se, rf, TT, D)
Let’s plot the probability of default as a function of equity. What we are doing here is to evaluate the probability of default function 50 times, as the \(E_0\) has 50 values from 1 to 20. By doing this, we end up with 50 values of the probability of default.
plot(x.E, pd.E, type = "l", ylab = "Probability of default",
xlab = "Equity at time zero", lwd = 3, ylim = c(0, 0.15))
lines(x.E, pd.E)
abline(v = E0, lty = 2)
abline(h = pd1, lty = 2)
points(E0, pd1, pch = 1, cex = 3, col = "red", lwd = 2)
Figure 2.16: The probability of default as a function of the equity at time zero.
These relationships between the parameters and the probability of default are all non-linear. This is interesting because according to the model, the current levels of the parameters are important when deciding which parameter might lead to a high or low impact over the probability of default.
plot(x.rf, pd.rf, type = "l", ylab = "Probability of default",
xlab = "Risk-free rate", lwd = 3)
abline(h = 0, lty = 2)
abline(v = rf, lty = 2)
abline(h = pd1, lty = 2)
points(rf, pd1, pch = 1, cex = 3, col = "red", lwd = 2)
Figure 2.17: The probability of default as a function of the risk-free rate.
Probability of default as a function of debt. Note that adding 1 or subtracting 1 unit of debt has different impact over the probability of default.
plot(x.D, pd.D, type = "l", ylab = "Probability of default",
xlab = "Debt", lwd = 3, ylim = c(0, 0.15))
abline(h = 0, lty = 2)
abline(v = D, lty = 2)
abline(h = pd1, lty = 2)
points(D, pd1, pch = 1, cex = 3, col = "red", lwd = 2)
Figure 2.18: The probability of default as a function of the debt.
Probability of default as a function of maturity. Note that asking for a higher debt maturity (as in a negotiation), do not lead to a lower probability of default if we keep the rest of the parameters unchanged in our firm. Imagine we arrange a meeting with the bank manager and we ask for more time to pay our debt. The bank manager should not (in principle) extend the debt maturity as long as we commit to do some changes in the firm. For example, we can negotiate a longer maturity and promise to increase the firm’s equity.
plot(x.T, pd.T, type = "l", ylab = "Probability of default",
xlab = "Maturity", lwd = 3)
abline(h = 0, lty = 2)
abline(v = TT, lty = 2)
abline(h = pd1, lty = 2)
points(TT, pd1, pch = 1, cex = 3, col = "red", lwd = 2)
Figure 2.19: The probability of default as a function of the debt’s maturity.
Probability of default as a function of the volatility of the equity \(\sigma_E\).
plot(x.se, pd.se, type = "l", ylab = "Probability of default",
xlab = "Standard deviation of equity", lwd = 3)
abline(h = 0, lty = 2)
abline(v = se, lty = 2)
abline(h = pd1, lty = 2)
points(se, pd1, pch = 1, cex = 3, col = "red", lwd = 2)
Figure 2.20: The probability of default as a function of the volatility of the equity.
In the previous plots we evaluate the probability of default function by changing one of the following parameters: \(E_0\), \(\sigma_E\), \(rf\), \(T\) or \(D\). We were able to do that because we constructed a vector of 50 different values for these five parameters. Note that \(V_0\) and \(\sigma_V\) do not remain fixed because they are at the same time a function of \(E_0\), \(\sigma_E\), \(rf\), \(T\) and \(D\). Now, let’s start by plotting the probability of default as a function of \(E_0\) and \(\sigma_E\).
One alternative is to evaluate the probability of default function 50 times by taking the 50 pairs of values of \(E_0\) and \(\sigma_E\). By doing that, we will end with three vectors of size 50 that we can plot as a three dimension scatter plot.
ED <- mapply(pd, x.E, x.se, rf, TT, D)
p.ED <- scatterplot3d(x.E, x.se, ED, pch = 16, type = "h", color = 1:50,
angle = 100, xlab = "Equity at time zero",
ylab = "Volatility of equity",
zlab = "Probability of default")
p.ED$points3d(E0, se, pd1, type = "h", col = "red", pch = 20, cex = 3)
Figure 2.21: The probability of default as a function of the equity at time zero and the volatility of equity.
The red point represent the case of the original textbook example in which \(pd = f(E_0=3, \sigma_E=0.8,rf=0.05,T=1,D=10)=0.1269396\). Note that the red point is not part of the 50 plotted observations simply because there is no case in which the probability of default function is evaluated in \(E_0=3, \sigma_E=0.8\). Although the plot above is not incorrect, it might be incomplete as we are not showing all possible values of the probability of default. One alternative to show a more complete plot is to evaluate all possible combinations of \(E_0\) and \(\sigma_E\) to evaluate the probability of default function. If we do that, we will have 50 values for \(E_0\) and \(\sigma_E\) that represent the \(x\) and \(y\) axis, and a \(50 \times 50\) matrix containing the probability of default. This allows us to plot a surface plot and a contour plot.
# Create the empty matrix.
p_E_se <- matrix(0, nrow = l, ncol = l)
# Fill the empty matrix with probability of default values.
for(i in 1 : l){ # Is there an easier way to do this?
for(j in 1 : l){
p_E_se[i,j] <- mapply(pd, x.E[i], x.se[j], rf, TT, D) } }
Let’s plot the results.
# Plot results.
p.ED1 <- persp(x.E, x.se, p_E_se, zlab = "pd",
xlab = "Equity at time zero",
ylab = "Volatility of equity", theta = 60, phi = 10,
expand = 0.5, col = "orange", shade = 0.2,
ticktype = "detailed")
# Add the original pd value as in the textbook example.
points(trans3d(E0, se, pd1, p.ED1), cex = 2, pch = 19, col = "blue")
Figure 2.22: The probability of default as a function of the equity at time zero and the volatility of equity: A plane or surface view.
Now the plot becomes a plane. A contour plot simplify the reading of a three dimension plot by reducing it into two dimensions: \(x\) and \(y\) coordinates. The value of the probability of default is represented by the contour lines.
contour(x.E, x.se, p_E_se, xlab = "Equity at time zero",
ylab = "Volatility of equity", lwd = 2)
points(E0, se, pch = 19, col = "blue", cex = 2)
Figure 2.23: The probability of default as a function of the equity at time zero and the volatility of equity: A contour view.
The blue point is the original case. Note that the value is slightly higher than 0.1, which is consistent with the \(pd=0.1269396\).
l = 40 # Vectors of length 50.
# Create vectors of the parameters.
x.E <- seq(from = 2, to = 4, length.out = l)
x.se <- seq(0.2, 1.5, length.out = l)
# Create the empty matrix.
p_E_se <- matrix(0, nrow = l, ncol = l)
# Fill the empty matrix with probability of default values.
for(i in 1 : l){ # Is there an easier way to do this?
for(j in 1 : l){
p_E_se[i,j] <- mapply(pd, x.E[i], x.se[j], rf, TT, D) } }
plot_ly(type = "surface" , x = x.se, y = x.E , z = p_E_se ) %>%
layout(#title = "Minimization of f at V0=12.395 an sv=0.212",
scene = list(xaxis = list(title = "Volatility of equity"),
yaxis = list(title = "Market value of equity"),
zaxis = list(title = "Probability of default"))) %>%
hide_colorbar()
Figure 2.24: The probability of default as a function of the equity at time zero and the volatility of equity: An interactive view.
#add_markers(y = 3, x = 0.8, z = 0.1269396, inherit = TRUE)
The problem. Daenerys Targaryen owns a big manufacturing firm that produces fire extinguishers. Surprisingly, the firm data corresponds exactly as the data in the example 24.3 ((Hull 2015)). She is planning to ask the Iron Bank of Braavos for a loan using the peaceful civilized way (without dragons). However, she would like to reduce the probability of default of the firm first, in that way she might negotiate better credit conditions. Daenerys has some understanding about very basic finance because she knows that she could either reduce the debt, or increase the capital in order to reduce the probability of default. Doing both alternatives at the same time is clearly more expensive so she is not very keen about it.
What is the best strategy to reduce the probability of default? Reduce the debt by 2 or increase the capital by 2?
D.seq <- seq(from = 0, to = 13, by = 0.1)
# Evaluate pd function: E0 changes from 1 to 5; debt goes from 0 to 13.
E1 <- mapply(pd, 1, se, rf, TT, D.seq)
E2 <- mapply(pd, 2, se, rf, TT, D.seq)
E3 <- mapply(pd, 3, se, rf, TT, D.seq)
E4 <- mapply(pd, 4, se, rf, TT, D.seq)
E5 <- mapply(pd, 5, se, rf, TT, D.seq)
Plot the results.
# We use these vectors for colors and legends in the following plots.
colors <- c("green", "purple", "black", "blue", "red")
leg <- c("E0=1", "E0=2", "E0=3 (initial value)", "E0=4", "E0=5")
# Plot.
plot(D.seq, E1, type = "l", col = "green", lwd = 3,
xlab = "Debt value. Initially, D=10",
ylab = "Probability of default. Initially PD=12.69%")
lines(D.seq, E2, col = "purple", lwd = 3)
lines(D.seq, E3, col = "black", lwd = 3)
lines(D.seq, E4, col = "blue", lwd = 3)
lines(D.seq, E5, col = "red", lwd = 3)
abline(v = D, lty = 2)
abline(h = pd1, lty = 2)
legend("bottomright", legend = leg, lwd = rep(3, 5), col = colors,
bg = "white")
points(D, pd1, pch = 19, cex = 2)
Figure 2.25: GoT problem. Probability of default and capital structure.
Let’s illustrate the alternatives as clear as possible.
plot(D.seq, E1, type = "l", col = "green", lwd = 3,
xlab = "Debt value. Initially, D=10",
ylab = "Probability of default. Initially PD=12.69%")
lines(D.seq, E2, col = "purple", lwd = 3)
lines(D.seq, E3, col = "black", lwd = 3)
lines(D.seq, E4, col = "blue", lwd = 3)
lines(D.seq, E5, col = "red", lwd = 3)
abline(v = D, lty = 2)
abline(v = 8, lty = 2)
abline(h = pd1, lty = 2)
abline(h = pd(E0, se, rf, TT, 8), lty = 2)
abline(h = pd(5, se, rf, TT, D), lty = 2)
legend("bottomright", legend = leg, lwd = rep(3, 5), col = colors,
bg = "white")
points(D, pd1, pch = 19, cex = 2)
points(D, pd(5, se, rf, TT, D), pch = 19, cex = 2, col = "red")
points(8, pd(3, se, rf, TT, 8), pch = 19, cex = 2, col = "yellow")
Figure 2.26: GoT solution. Probability of default and capital structure.
Consider a different initial situation. Now, \(E_0=2\) and \(D=2\) and the rest remains the same. In this case, the probability of default is now 7.13%. What is the best strategy to reduce the probability of default? Reduce the debt by 1 or increase the capital by 1?
leg2 <- c("E0=1", "E0=2 (initial value)", "E0=3", "E0=4", "E0=5")
plot(D.seq, E1, type = "l", col = "green", lwd = 3,
xlab = "Debt value. Initially, D=2",
ylab = "Probability of default. Initially PD=7.13%",
xlim = c(0, 2.5), ylim = c(0, 0.08))
lines(D.seq, E2, col = "purple", lwd = 3)
lines(D.seq, E3, col = "black", lwd = 3)
lines(D.seq, E4, col = "blue", lwd = 3)
lines(D.seq, E5, col = "red", lwd = 3)
abline(v = 2, lty = 2)
abline(h = pd(2, se, rf, TT, 2), lty = 2)
legend("bottomright", legend = leg2, lwd = rep(3, 5), col = colors,
bg = "white", cex = 0.6)
points(2, pd(2, se, rf, TT, 2), pch = 19, cex = 2, col = "purple")
Figure 2.27: GoT problem 2. Probability of default and capital structure.
The solution can be illustrated as follows:
plot(D.seq, E1, type = "l", col = "green", lwd = 3,
xlab = "Debt value. Initially, D=2",
ylab = "Probability of default. Initially PD=7.13%",
xlim = c(0, 2.5), ylim = c(0, 0.08))
lines(D.seq, E2, col = "purple", lwd = 3)
lines(D.seq, E3, col = "black", lwd = 3)
lines(D.seq, E4, col = "blue", lwd = 3)
lines(D.seq, E5, col = "red", lwd = 3)
abline(v = 2, lty = 2)
abline(v = 1, lty = 2)
abline(h = pd(2, se, rf, TT, 2), lty = 2)
abline(h = pd(2, se, rf, TT, 1), lty = 2)
abline(h = pd(E0, se, rf, TT, 2), lty = 2)
legend("bottomright", legend = leg2, lwd = rep(3, 5), col = colors,
bg = "white", cex = 0.6)
points(2, pd(2, se, rf, TT, 2), pch = 19, cex = 2, col = "purple")
points(1, pd(2, se, rf, TT, 1), pch = 19, cex = 2, col = "yellow")
points(2, pd(E0, se, rf, TT, 2), pch = 19, cex = 2)
Figure 2.28: GoT solution 2. Probability of default and capital structure.
Interesting.
Consider a third problem. In a parallel universe, Daenerys Targaryen’s firm has some liquidity short-term troubles. She needs either more cash now or some more time to pay her current debt. She would like to know which alternative leads to the lowest increase of the probability of default. Add $2 more to her current debt, or ask for a half year more time to pay her current debt?
T1 <- mapply(pd, E0, se, rf, 0.5, D.seq)
T2 <- mapply(pd, E0, se, rf, 1, D.seq)
T3 <- mapply(pd, E0, se, rf, 1.5, D.seq)
T4 <- mapply(pd, E0, se, rf, 2, D.seq)
T5 <- mapply(pd, E0, se, rf, 2.5, D.seq)
leg3 <- c("T=2.5", "T=2", "T=1.5", "T=1", "T=0.5")
plot(D.seq, T5, type = "l", col = "green", lwd = 3,
xlab = "Debt value. Initially, D=6",
ylab = "Probability of default. Initially PD=20.33%")
lines(D.seq, T4, col = "purple", lwd = 3)
lines(D.seq, T3, col = "black", lwd = 3)
lines(D.seq, T2, col = "blue", lwd = 3)
lines(D.seq, T1, col = "red", lwd = 3)
abline(v = 6, lty = 2)
abline(h = pd(E0, se, rf, 1.5, 6), lty = 2)
legend("bottomright", legend = leg3, lwd = rep(3, 5), col = colors,
bg = "white")
points(6, pd(E0, se, rf, 1.5, 6), pch = 19, cex = 2)
Figure 2.29: GoT problem 3. More debt or longer maturity?
Let’s visualize the result.
plot(D.seq, T5, type = "l", col = "green", lwd = 3,
xlab = "Debt value. Initially, D=6",
ylab = "Probability of default. Initially PD=20.33%")
lines(D.seq, T4, col = "purple", lwd = 3)
lines(D.seq, T3, col = "black", lwd = 3)
lines(D.seq, T2, col = "blue", lwd = 3)
lines(D.seq, T1, col = "red", lwd = 3)
abline(v = 6, lty = 2)
abline(v = 8, lty = 2)
abline(h = pd(E0, se, rf, 1.5, 6), lty = 2)
abline(h = pd(E0, se, rf, 2, 6), lty = 2)
abline(h = pd(E0, se, rf, 1.5, 6), lty = 2)
abline(h = pd(E0, se, rf, 1.5, 8), lty = 2)
legend("bottomright", legend = leg3, lwd = rep(3, 5), col = colors,
bg = "white")
points(6, pd(E0, se, rf, 1.5, 6), pch = 19, cex = 2)
points(8, pd(E0, se, rf, 1.5, 8), pch = 19, cex = 2, col = "yellow")
points(6, pd(E0, se, rf, 2, 6), pch = 19, cex = 2, col = "purple")
Figure 2.30: GoT answer 3. More debt or longer maturity?
Nice.
# Copula models
library(MASS)
library(knitr)
library(viridis)
library(dplyr)
library(mnormt) #dnorm()
library(ggplot2)
library(plotly)
library(rayshader)
library(vembedr)
Copulas allow us to decompose a joint probability distribution into their marginals (which by definition have no correlation) and a function which couples (hence the name) them together and thus allows us to specify the correlation separately. The copula is that coupling function. Here, we introduce the simplest type of copulas into a very common problem in credit risk which is the time to default.
In a finance-context, the variable \(x\) represents a firm’s future (unknown) performance measure that goes from \(-4\) to \(4\) in the horizontal axis. Strictly speaking, more extreme values of \(x\) like \(-\infty\) and \(+\infty\) are theoretically possible but are very rare and happen quite infrequently at least in real-life situations. At the moment, we do not care too much about what kind of performance measure this is, it could be liquidity for example, solvency, or any other that it is normalized to have values between \(-4\) and \(4\). In a statistics-context, we can think that \(x\) is equivalent to the so-called \(z\)-score in the context of the standardized normal distribution function.
# The standard normal distribution function is: y = f(x).
x <- seq(-4, 4, 0.001) # First define x.
y <- 1 / sqrt(2 * pi) * exp(-x ^ 2 / 2) # Define y as a function of x.
# Now plot.
plot(x, y, type = "l", lwd = 3, col = "blue" ,
ylab = "Standard normal density function: dnorm(x)")
abline(h = 0, lty = 2)
Figure 3.1: Standard normal distribution function.
The copula model, including the Gaussian, considers that the future firm performance measure \(x\) in the horizontal axis is related with the firm’s probability of default (from 0 to 1, or 0% to 100%) represented by the value of the cumulative density function pnorm() of the normal distribution.
Assume the firm will default in six months as long as the firm performance \(x\) in six months is lower than a given threshold: \(\bar{x}=-2.5\). The threshold \(\bar{x}=-2.5\) in the Gaussian copula model is equivalent to the \(-d_2\) in the Merton model. Then, the six months firm’s probability of default in math notation is \(N(-2.5)\), or pnorm(-2.5)=0.006209665 in R code, which is very low as it is close to 0. It is the red area below, which is 0.62% of the 100% possible area under the curve.
# Useful to construct next plots.
D <- dnorm(x)
P <- pnorm(x)
plot(x, D, type = "l", lwd = 3, col = "blue" ,
ylab = "Standard normal density function: dnorm(x)")
abline(h = 0, lty = 2)
abline(v = -2.5, lty = 2)
polygon(c(x[x < -2.5], -2.5), c(D[x < -2.5], 0), col = "red")
Figure 3.2: Six months probability of default.
Now assume the firm will default in one year as long as the firm performance \(x\) in one year is lower than a higher given threshold: \(\bar{x}=-2\). Then, the one year probability of default is \(N(-2)\) or pnorm(-2)=0.02275013 which is still low but higher than the six months probability of default. Note that the new red area is 2.27%.
plot(x, D, type = "l", lwd = 3, col = "blue" ,
ylab = "Standard normal density function: dnorm(x)")
abline(h = 0, lty = 2)
abline(v = -2, lty = 2)
polygon(c(x[x < -2], -2), c(D[x < -2], 0), col = "red")
Figure 3.3: One year probability of default.
At the moment, we have not discussed how to estimate the threshold values \(\bar{x}\). However, we can say that the Gaussian copula model assumes that the probability of default increases as time passes. As if the firm will eventually default in the future. The six month probability of default 0.62% is low because it is not very likely that the future firm performance lies below -2.5 in six months. The one year probability of default 2.27% is a bit higher because it is more likely that the future firm performance lies below -2 in one year.
Then, the pnorm() R function allows us to transform threshold values \(\bar{x}\) into a probabilities of default. Transform probabilities into \(\bar{x}\) is also possible as the function qnorm() is the inverse of pnorm().
From \(\bar{x}\) to probabilities:
pnorm(-2.5)
## [1] 0.006209665
pnorm(-2)
## [1] 0.02275013
From probabilities to \(\bar{x}\):
qnorm(0.006209665)
## [1] -2.5
qnorm(0.02275013)
## [1] -2
Nice.
The function dnorm() is relevant when we implement a graphical approach because it represents how frequent (or how likely) these values are given the standard normal distribution, so in both extreme values of \(x\) the value of dnorm() is low. Here, extreme values of \(x\) can be represented by \(-4\) and \(4\). This function delivers the height of the standard normal distribution, dnorm(4)=0.0001338302, and dnorm(-4)=0.0001338302. Given that the standard normal function is symmetrical, we have that in general: dnorm(-x)=dnorm(x). The letter \(d\) in dnorm() stands for density and it is maximum at \(x=0\). When plotting the density values we get the bell-shaped normal curve.
See how these pnorm(), qnorm(), dnorm() R functions work and relate:
# Here, x has 11 values only.
x.summary <- c(-Inf, -4:4, Inf) # vector of x values to evaluate functions.
ans <- data.frame(x.summary, dnorm(x.summary),
pnorm(x.summary), qnorm(pnorm(x.summary)))
colnames(ans) <- c("x", "dnorm(x)=height", "pnorm(x)=pd", "qnorm(pd)=x")
kable(ans, caption = "Review of standard normal distribution functions.",
digits = 5)
| x | dnorm(x)=height | pnorm(x)=pd | qnorm(pd)=x |
|---|---|---|---|
| -Inf | 0.00000 | 0.00000 | -Inf |
| -4 | 0.00013 | 0.00003 | -4 |
| -3 | 0.00443 | 0.00135 | -3 |
| -2 | 0.05399 | 0.02275 | -2 |
| -1 | 0.24197 | 0.15866 | -1 |
| 0 | 0.39894 | 0.50000 | 0 |
| 1 | 0.24197 | 0.84134 | 1 |
| 2 | 0.05399 | 0.97725 | 2 |
| 3 | 0.00443 | 0.99865 | 3 |
| 4 | 0.00013 | 0.99997 | 4 |
| Inf | 0.00000 | 1.00000 | Inf |
You can type for example ?dnorm in the RStudio console to see more details about this (and other) functions.
Note that the only distribution function that we are currently analyzing is the standard normal (Gaussian) distribution function. A standard normal distribution function is the one that has mean 0 and variance 1. There are other copula models that assume other kinds of distributions. These other kinds of copulas are useful when we are interested in modeling cases in which extreme values are more likely to happen compared with the standard normal distribution. Understanding Gaussian copulas allows you to understand other more elaborated copulas.
Instead of a density function, we can plot the cumulative probability distribution. Now, we do not need the \(N(\cdot)\) function to compute the probability as the vertical axis already represents the probability of default.
plot(x, P, type = "l", lwd = 3,
ylab = "Cumulative probability function: pnorm(x)", xlab = "x")
abline(h = 0.5, lty = 2)
abline(v = 0, lty = 2)
Figure 3.4: The higher the x, the higher the probability of default.
Both in the same plot.
plot(x, P, type = "l", lwd = 3,
ylab = "pnorm(x) and dnorm(x)", xlab = "x")
lines(x, D, type = "l", lwd = 3, col = "blue")
abline(h = 0.5, lty = 2)
abline(v = 0, lty = 2)
Figure 3.5: Cumulative density and probability functions.
Here, we start analyzing Hull (2015) example 24.6. We do not analyze the 10 firms yet as in the original example. Instead, we first make some sense about the data, the relevant analysis, the logic, and the model basics before dealing with the full features in example in Hull (2015) 24.6.
Recall that, pnorm() leads to a probability, whereas qnorm() leads to a value of \(x\). According to Hull (2015), the cumulative probabilities of default of 1%, 3%, 6%, 10% and 15% are taken as given for the maturities of 1, 2, 3, 4 and 5 years respectively. To implement the model, we take this information to derive the corresponding threshold values \(\bar{x}\):
pd <- c(0.01, 0.03, 0.06, 0.1, 0.15) # Probabilities of default per year.
x.y <- qnorm(pd) # Transform probabilities of default into x values.
ans <- data.frame(1:5, pd, x.y) # Gather the results.
colnames(ans) <- c("year", "pd", "x (values given in Hull)")
kable(ans, caption = "Main parameters.")
| year | pd | x (values given in Hull) |
|---|---|---|
| 1 | 0.01 | -2.326348 |
| 2 | 0.03 | -1.880794 |
| 3 | 0.06 | -1.554774 |
| 4 | 0.10 | -1.281552 |
| 5 | 0.15 | -1.036433 |
Note that the example assumes that the probability of default increases as we consider a longer maturity (from 1 year to 5 years). Probabilities above are cumulative, so they go from year 0 to year 1, from year 0 to year 2 and so on. To calculate the probability of default during a specific year we need to calculate the differences. In particular:
ans <- data.frame(diff(pd))
colnames(ans) <- c("PD")
rownames(ans) <- c("x.y1 to x.y2", "x.y2 to x.y3",
"x.y3 to x.y4", "x.y4 to x.y5")
kable(ans, caption = "PD at specific years.")
| PD | |
|---|---|
| x.y1 to x.y2 | 0.02 |
| x.y2 to x.y3 | 0.03 |
| x.y3 to x.y4 | 0.04 |
| x.y4 to x.y5 | 0.05 |
This is how we can illustrate the case of a probability of default of 15% in 5 years. The green area represents the 15% of the whole area below the bell-shaped curve.
plot(x, D, type = "l", col = 'black', lwd = 3, ylim = c(0, 0.4), xlab = "x",
ylab = "Standard normal density function: dnorm(x)")
abline(h = 0, lty = 2)
abline(v = x.y[5], lty = 2)
polygon(c(x[x < x.y[5]], x.y[5]),
c(D[x < x.y[5]], 0), col = "green")
legend("topright", legend=c(
"pd(5years)=15%
The left green area
represents 15% of the
whole bell-shape.
N^-1(0.15)=-1.036433
N(-1.036433)=0.15"),
bg = "white", pch = 19, cex = 0.8, bty = "n", col = "green")
Figure 3.6: Gaussian density distribution function.
A complementary view is the cumulative probability function. Let’s illustrate the same case: a probability of default of 15% in 5 years. In this case, we do not need to calculate the area since the y-axis already represents the probability.
plot(x, P, type = "l", col = 'black', lwd = 3, ylim = c(0, 1),
xlab = "x",
ylab = "Cumulative probability function: pnorm(x)")
abline(h = 0, lty = 2)
abline(h = 1, lty = 2)
lines(seq(-5, x.y[5], length.out = 2), rep(pnorm(x.y[5]), 2),
col = "green", lwd = 3)
lines(rep(x.y[5], 2), seq(0, pnorm(x.y[5]), length.out = 2),
col = "green", lwd = 3)
points(x.y[5], 0.15, pch = 19, col = "green", cex = 2)
legend("right", legend=c(
"pd(5years)=15%
N^-1(0.15)=-1.036433
N(-1.036433)=0.15"),
bg = "white", pch = 19, cex = 1, bty = "n", col = "green")
Figure 3.7: Gaussian probability distribution function.
A closer view to the figure above to see the 3-year and 5-year cases:
plot(x, P, type = "l", col = 'black', lwd = 5, ylim = c(0, 0.22),
xlim = c(-4, 0), ylab = "Cumulative probability function: pnorm(x)",
xlab = "x")
abline(h = 0, lty = 2)
abline(h = 1, lty = 2)
lines(seq(-5, x.y[3], length.out = 2), rep(pnorm(x.y[3]), 2),
col = "purple", lwd = 3, lty = 2)
lines(rep(x.y[3], 2), seq(0, pnorm(x.y[3]), length.out = 2),
col = "purple", lwd = 3, lty = 2)
lines(seq(-5, x.y[5], length.out = 2), rep(pnorm(x.y[5]), 2),
col = "green", lwd = 3, lty = 2)
lines(rep(x.y[5], 2), seq(0, pnorm(x.y[5]), length.out = 2),
col = "green", lwd = 3, lty = 2)
points(x.y[3], 0.06, pch = 19, col = "purple", cex = 2)
points(x.y[5], 0.15, pch = 19, col = "green", cex = 2)
legend("topleft", legend=c("pd(5years)=15%: N(-1.036433)=0.15",
"pd(3years)=6%: N(-1.554774)=0.06"),
pch = 19, col = c("green", "purple"), bg = "white", cex = 1, bty = "n")
Figure 3.8: Gaussian probability distribution function: Zoom version.
Nice.
Now let’s introduce a simulation approach. This is, instead of generating continuous values of \(x\) from \(-4\) to \(4\), we simulate many \(x\) values (10,000 in this case) that follow a standard normal distribution function using the rnorm() function. The simulation approach is useful especially when we are interested in replicating what happens in real-life situations because we can artificially replicate the observed behavior many times and understand what may happen in the future. In other words, x was used before to characterize a perfect normal distribution. Now, we incorporate x.sim that behaves as a normal distribution and allows some randomness just as in real life situations.
See the difference between both approaches to generate \(x\).
N <- 10000 # Number of simulated values.
set.seed(130575) # Reproducibility.
x.sim <- rnorm(N, 0, 1) # Simulation.
kable(head(cbind(x.sim, x)))
| x.sim | x |
|---|---|
| -1.7939058 | -4.000 |
| -0.7515449 | -3.999 |
| -0.3569900 | -3.998 |
| 0.9629299 | -3.997 |
| 0.3680735 | -3.996 |
| 1.0714250 | -3.995 |
Note that the probabilities of default per maturity in the simulated approach are close to the values of the previous section. In particular, \(0.01\) is equivalent to \(0.0102\), and \(0.03\) is equivalent to \(0.0307\). They do not match exactly simply because we are comparing theoretical versus simulated probabilities.
# Function to calculate proportions that we understand as probabilities.
prop <- function(x) {
ans <- length(x.sim[x.sim <= x]) / N
}
pd.sim <- mapply(prop, x.y) # Apply the function.
ans <- data.frame(x.y, pd, pd.sim)
colnames(ans) <- c("x", "pd.theo", "pd.sim")
rownames(ans) <- c("y1", "y2", "y3", "y4", "y5")
kable(ans, caption = "Theoretic versus simulated probabilities of default.")
| x | pd.theo | pd.sim | |
|---|---|---|---|
| y1 | -2.326348 | 0.01 | 0.0102 |
| y2 | -1.880794 | 0.03 | 0.0307 |
| y3 | -1.554774 | 0.06 | 0.0629 |
| y4 | -1.281552 | 0.10 | 0.1008 |
| y5 | -1.036433 | 0.15 | 0.1480 |
Let’s view the results of the simulated approach. First in a histogram.
# Some parameters we need to plot.
L <- c(-4, 4) # axis limits.
colors2 <- c("blue", "red", "purple", "pink", "green")
legend2 = c("pd(1year)=1.02%: x<=-2.326348",
"pd(2years)=3.07%: x<=-1.880794", "pd(3years)=6.29%: x<=-1.554774",
"pd(4years)=10.08%: x<=-1.281552", "pd(5years)=14.8%: x<=-1.036433")
# The histogram.
hist(x.sim, 500, xlim = L, ylim = c(0, 100), main = NULL, xlab = "x.sim")
abline(h = 0, lty = 2)
abline(v = x.y[1], lwd = 3, col = "blue")
abline(v = x.y[2], lwd = 3, col = "red")
abline(v = x.y[3], lwd = 3, col = "purple")
abline(v = x.y[4], lwd = 3, col = "pink")
abline(v = x.y[5], lwd = 3, col = "green")
legend("topright", legend = legend2, bg = "transparent",
text.col = colors2, cex = 0.9)
Figure 3.9: Simulated Gaussian probability distribution function. Somewhat different with respect to the theoretical.
The area at the left hand side of each colored line represents the cumulative probability of default just as we explained before. In the same way, the area between two colored lines represents the probability of default in a specific period of time.
Now let’s see all the simulated data at once.
plot(x.sim, ylab = "x.sim", pch = ".", ylim = c(-4, 4),
xlab = "Number of simulaltions")
abline(h = x.y[1], lwd = 2, col = "blue")
abline(h = x.y[2], lwd = 2, col = "red")
abline(h = x.y[3], lwd = 2, col = "purple")
abline(h = x.y[4], lwd = 2, col = "pink")
abline(h = x.y[5], lwd = 2, col = "green")
abline(v = 0, lty = 2)
abline(v = 10000, lty = 2)
Figure 3.10: An alternative view.
# legend("topright", legend = legend2, bg = "white",
# text.col = colors2, cex = 0.9)
We normally conduct a simulation approach because we might adapt the distribution function parameters to match what we see in the real life situations. This is how we can reproduce what may happen in the future and at the same time allow for some randomness.
In a different context, here we have an old video in which we drop 2,000 blue balls. Balls are dropped randomly given a normal distribution. Note how the bell-shape distribution is formed as we increase the number of balls.
embed_url("https://youtu.be/P1Ky6-a8DE0")
Consider the case in which we have two firms instead of one. The main difference now is that instead of simulating one firm we need two. Moreover, each firm follows a standard normal distribution function and both of them are correlated by a given correlation value so the firms are not independent. If they are not independent, then what happens to one firm has some impact on what happens to the other.
In a different context, here we have an old video in which we drop many blue balls. Balls are dropped randomly given a multi-variate normal distribution. Note how the 3-D bell-shape distribution is formed as we increase the number of balls.
embed_url("https://youtu.be/5iv5Zksf_zo")
The case of two firms is not the one presented in Hull (2015) example but it can help us to visualize how the Gaussian copula approach works in a two-dimension plot where the default correlation is 0.2.
The simulation of both firms’ performance measures is x2.
m <- 2 # number of firms
n <- 10000 # number of simulations
rho_pos02 <- 0.2 # correlation
corr_pos02 <- matrix(rep(rho_pos02, m * m), m, m) # correlation matrix
diag(corr_pos02) <- 1
set.seed(130575)
x2 <- mvrnorm(n, mu = rep(0, m), Sigma = corr_pos02)
x2 <- data.frame(x2)
colnames(x2) <- c("f1", "f2")
The matrix x2 length is 10,000 for each firm. In other words, we have 10,000 observations of the performance measure or \(z\)-score for two firms that are related. This matrix is big, but we can visualize the header (the first six observations).
kable(head(x2), caption = "Firms' performance x2.", row.names = TRUE)
| f1 | f2 | |
|---|---|---|
| 1 | 0.0880039 | -2.8671108 |
| 2 | 0.0092849 | -1.1735732 |
| 3 | 0.0323809 | -0.5854275 |
| 4 | 1.5547886 | -0.0630241 |
| 5 | 0.9679767 | -0.3977597 |
| 6 | 0.7751329 | 0.8847116 |
Remember the cumulative probabilities of default are 1%, 3%, 6%, 10% and 15% for the maturities of 1, 2, 3, 4 and 5 years respectively. How do we extract those cases in which both firms will default in 5 years? In rows 15, 53, 61 and so on both firms default at the same time. Note that in all cases the \(x\) values are indeed below -1.036433.
# These names are going to be useful later.
n.year <- c("year 1", "year 2", "year 3", "year 4", "year 5")
n.pd <- c("pd.y1", "pd.y2", "pd.y3", "pd.y4", "pd.y5")
n.f <- c("f1", "f2", "f1 default?", "f2 default?")
# Function to calculate cases in which firms default and probabilities.
fun.X <- function(x) {
both <- x2[x2$f2 < x & x2$f1 < x, ] # both default.
any <- x2[x2$f2 < x | x2$f1 < x, ] # at least one firm default.
onlyf1 <- x2[x2$f2 > x & x2$f1 < x, ] # only firm 1 default.
onlyf2 <- x2[x2$f1 > x & x2$f2 < x, ] # only firm 2 default.
one <- x2[(x2$f2 < x & x2$f1 > x | # only one firm default.
x2$f2 > x & x2$f1 < x),]
none <- x2[x2$f2 > x & x2$f1 > x, ] # no firm default.
# Gather results and probabilities in a list.
ans <- list(both = both, any = any, onlyf1 = onlyf1,
onlyf2 = onlyf2, one = one, none = none,
both.pd = (nrow(both) / n), any.pd = (nrow(any) / n),
onlyf1.pd = (nrow(onlyf1) / n), onlyf2.pd = (nrow(onlyf2)/ n),
one.pd = (nrow(one) / n), none.pd = (nrow(none) / n))
}
# X has all the relevant results for x2.
X <- mapply(fun.X, x.y)
See the cases in which both firms default in 5 years.
# Extract "both" cases, year 5.
both <- data.frame(X[["both", 5]], X[["both", 5]] < x.y[5])
colnames(both) <- n.f
kable(head(both), caption = "Cases in which both firms default in 5 years.",
row.names = TRUE)
| f1 | f2 | f1 default? | f2 default? | |
|---|---|---|---|---|
| 15 | -1.123330 | -1.783569 | TRUE | TRUE |
| 53 | -2.724912 | -1.235614 | TRUE | TRUE |
| 61 | -1.968789 | -1.516563 | TRUE | TRUE |
| 108 | -2.121617 | -1.807062 | TRUE | TRUE |
| 156 | -1.559930 | -1.132879 | TRUE | TRUE |
| 157 | -1.194204 | -2.120824 | TRUE | TRUE |
In total, we have 343 cases in which both firms default at the same time. The first case is number 15, the second 53, the third 61 and so on. It is easy to know the total cases if we count the number of rows.
How do we extract those cases in which any firm will default in 5 years? This is, only firm 1, only firm 2, or both at the same time. This is a less strict condition so we would expect to have more cases to match this new criteria compared with both. Note that in all cases at least one one firm is indeed below -1.036433. In row 1, 2 and 10 firm 2 defaults. In row 15 both firms default. In row 18 and 20 firm 1 and firm 2 default respectively.
any <- data.frame(X[["any", 5]], X[["any", 5]] < x.y[5])
colnames(any) <- n.f
kable(head(any), caption = "Cases in which at least one firm default.",
row.names = TRUE)
| f1 | f2 | f1 default? | f2 default? | |
|---|---|---|---|---|
| 1 | 0.0880039 | -2.8671108 | FALSE | TRUE |
| 2 | 0.0092849 | -1.1735732 | FALSE | TRUE |
| 10 | 1.4475216 | -1.1512274 | FALSE | TRUE |
| 15 | -1.1233296 | -1.7835692 | TRUE | TRUE |
| 18 | -1.5829965 | -0.7174298 | TRUE | FALSE |
| 20 | -0.9221666 | -1.9982869 | FALSE | TRUE |
Now we have 2,660 cases. Considerably more as the | restriction is less strict than the &. Let’s see the cases in which only firm 1 defaults.
onlyf1 <- data.frame(X[["onlyf1", 5]], X[["onlyf1", 5]] < x.y[5])
colnames(onlyf1) <- n.f
kable(head(onlyf1), caption = "Cases in which only firm 1 default.")
| f1 | f2 | f1 default? | f2 default? | |
|---|---|---|---|---|
| 18 | -1.582996 | -0.7174298 | TRUE | FALSE |
| 34 | -2.842355 | 1.5400020 | TRUE | FALSE |
| 35 | -2.090320 | 0.7431275 | TRUE | FALSE |
| 36 | -1.600142 | 0.3082653 | TRUE | FALSE |
| 41 | -2.193496 | -0.8465008 | TRUE | FALSE |
| 49 | -1.764711 | -0.3914202 | TRUE | FALSE |
Only firm 2 defaults.
onlyf2 <- data.frame(X[["onlyf2", 5]], X[["onlyf2", 5]] < x.y[5])
colnames(onlyf2) <- n.f
kable(head(onlyf2), caption = "Cases in which only firm 2 default.")
| f1 | f2 | f1 default? | f2 default? | |
|---|---|---|---|---|
| 1 | 0.0880039 | -2.867111 | FALSE | TRUE |
| 2 | 0.0092849 | -1.173573 | FALSE | TRUE |
| 10 | 1.4475216 | -1.151227 | FALSE | TRUE |
| 20 | -0.9221666 | -1.998287 | FALSE | TRUE |
| 22 | -0.1110555 | -1.348605 | FALSE | TRUE |
| 40 | 0.1058949 | -1.752620 | FALSE | TRUE |
Only one firm default at year 5.
one <- data.frame(X[["one", 5]], X[["one", 5]] < x.y[5])
colnames(one) <- n.f
kable(head(one), caption = "Cases in which only one firm default.")
| f1 | f2 | f1 default? | f2 default? | |
|---|---|---|---|---|
| 1 | 0.0880039 | -2.8671108 | FALSE | TRUE |
| 2 | 0.0092849 | -1.1735732 | FALSE | TRUE |
| 10 | 1.4475216 | -1.1512274 | FALSE | TRUE |
| 18 | -1.5829965 | -0.7174298 | TRUE | FALSE |
| 20 | -0.9221666 | -1.9982869 | FALSE | TRUE |
| 22 | -0.1110555 | -1.3486051 | FALSE | TRUE |
Lastly, when no firm defaults.
none <- data.frame(X[["none", 5]], X[["none", 5]] < x.y[5])
colnames(none) <- n.f
kable(head(none), caption = "Cases in which no firm default.")
| f1 | f2 | f1 default? | f2 default? | |
|---|---|---|---|---|
| 3 | 0.0323809 | -0.5854275 | FALSE | FALSE |
| 4 | 1.5547886 | -0.0630241 | FALSE | FALSE |
| 5 | 0.9679767 | -0.3977597 | FALSE | FALSE |
| 6 | 0.7751329 | 0.8847116 | FALSE | FALSE |
| 7 | 1.3361498 | 1.0392145 | FALSE | FALSE |
| 8 | 1.3039572 | 1.5191303 | FALSE | FALSE |
Finally, probabilities.
both.pd <- t(data.frame(X["both.pd",]))
any.pd <- t(data.frame(X["any.pd",]))
onlyf1.pd <- t(data.frame(X["onlyf1.pd",]))
onlyf2.pd <- t(data.frame(X["onlyf2.pd",]))
one.pd <- t(data.frame(X["one.pd",]))
none.pd <- t(data.frame(X["none.pd",]))
ans <- data.frame(both.pd, any.pd, onlyf1.pd, onlyf2.pd,
one.pd, none.pd)
rownames(ans) <- n.year
kable(ans, caption = "Probabilities of default.")
| both.pd | any.pd | onlyf1.pd | onlyf2.pd | one.pd | none.pd | |
|---|---|---|---|---|---|---|
| year 1 | 0.0003 | 0.0192 | 0.0103 | 0.0086 | 0.0189 | 0.9808 |
| year 2 | 0.0019 | 0.0560 | 0.0282 | 0.0259 | 0.0541 | 0.9440 |
| year 3 | 0.0066 | 0.1141 | 0.0561 | 0.0514 | 0.1075 | 0.8859 |
| year 4 | 0.0176 | 0.1854 | 0.0864 | 0.0814 | 0.1678 | 0.8146 |
| year 5 | 0.0343 | 0.2660 | 0.1210 | 0.1107 | 0.2317 | 0.7340 |
So interesting.
Note that 15% is the theoretical probability that one firm will default in 5 years. Here, this 15% is 12.1% for firm 1 and 11.07% for firm 2 when the data is simulated.
We can perform a simple test to verify that everything is alright. For example, this equation must hold: onlyf1+onlyf2=one. Substituting for the year 5: \(0.121+0.1107=0.2317\). As you can see, everything is alright. This equation must hold as well: any-both=one. Substituting for the year 5: \(0.266-0.0343=0.2317\).
Let’s visualize all 10,000 cases. Each dot represents a couple of Firm 1 and Firm 2 \(x\) values and the dotted lines the threshold that represents the probability of default in 5 years.
par(pty = "s") # Figures are shown in a perfect square (not a rectangle).
plot(x2, pch = ".", cex = 1)
abline(v = x.y[5], lty = 2)
abline(h = x.y[5], lty = 2)
legend("bottomright", legend = c(paste(nrow(x2))), bty = "n")
Figure 3.11: All 10,000 simulated cases.
These 10,000 observations are highly concentrated around the mean which is very close to zero. This can be also easily seen in the following density plot.
ggplot(x2, aes(x = f1, y = f2) ) +
stat_density_2d(aes(fill = ..level..), geom = "polygon",
colour = "white") +
coord_fixed()
Figure 3.12: All 10,000 simulated cases: A density view.
This is the 3-D rayshader video version of the plot above.
embed_url("https://youtu.be/3p-B19R8RwA")
And a third interactive version.
f <- function(x, y) dmnorm(cbind(x, y), c(0, 0), corr_pos02)
z <- outer(sort(x2[1:100,1]), sort(x2[1:100,2]), f)
#create contour plot
#contour(x2[1:100,1], x2[1:100,2], z)
plot_ly(type = "surface" , x = sort(x2[1:100,2]),
y = sort(x2[1:100,1]) , z = z ) %>%
layout(#title = "Surface",
scene = list(xaxis = list(title = "f1", range = c(-2,2)),
yaxis = list(title = "f2", range = c(-2,2)),
zaxis = list(title = "Density"))) %>%
hide_colorbar()
Figure 3.13: An interactive view.
We can visualize the default cases. First, the case when both firms default at year 5.
par(pty = "s")
plot(X[["both", 5]], xlim = L, ylim = L, pch = ".", cex = 1)
abline(v = x.y[5], lty = 2)
abline(h = x.y[5], lty = 2)
legend("topright", legend = c(paste(both.pd[5]*n)), bty = "n")
Figure 3.14: Both firms default at year 5.
The values within the plot represent the number of cases. Here, we have 343 times (out of 10,000) in which both firms default at the same time in 5 years. Note that this is a cumulative probability of default.
Now, the case in which any firm defaults in 5 years.
par(pty = "s")
plot(X[["any", 5]], xlim = L, ylim = L, pch = ".", cex = 1)
abline(v = x.y[5], lty = 2)
abline(h = x.y[5], lty = 2)
legend("topright", legend = c(paste(any.pd[5]*n)), bty = "n")
Figure 3.15: Any firm defaults in 5 years.
Only firm 1 defaults in 5 years.
par(pty = "s")
plot(X[["onlyf1", 5]], xlim = L, ylim = L, pch = ".", cex = 1)
abline(v = x.y[5], lty = 2)
abline(h = x.y[5], lty = 2)
legend("topright", legend = c(paste(onlyf1.pd[5]*n)), bty = "n")
Figure 3.16: Only firm 1 defaults in 5 years.
Only firm 2 defaults in 5 years.
par(pty = "s")
plot(X[["onlyf2", 5]], xlim = L, ylim = L, pch = ".", cex = 1)
abline(v = x.y[5], lty = 2)
abline(h = x.y[5], lty = 2)
legend("topright", legend = c(paste(onlyf2.pd[5]*n)), bty = "n")
Figure 3.17: Only firm 2 defaults in 5 years.
This is the case in which only one firm defaults.
par(pty = "s")
plot(X[["one", 5]], xlim = L, ylim = L, pch = ".", cex = 1)
abline(h = x.y[5], lty =2)
abline(v = x.y[5], lty =2)
legend("topright", legend = c(paste(one.pd[5]*n)), bty = "n")
Figure 3.18: Only one firm defaults in 5 years.
This is the case in which no one firm defaults.
par(pty = "s")
plot(X[["none", 5]], xlim = L, ylim = L, pch = ".", cex = 1)
abline(h = x.y[5], lty =2)
abline(v = x.y[5], lty =2)
legend("topright", legend = c(paste(none.pd[5]*n)), bty = "n")
Figure 3.19: No firm defaults in 5 years.
It is a good idea to summarize all previous plots in one.
# none
par(mfrow = c(2, 3), oma = c(0, 0, 2, 0))
par(pty = "s")
plot(X[["none", 5]], xlim = L, ylim = L, pch = ".", cex = 1,
main = "None.")
abline(v = x.y[5], lty = 2)
abline(h = x.y[5], lty = 2)
legend("bottomright", legend = c(paste(none.pd[5]*n)), bty = "n")
# both
par(pty = "s")
plot(X[["both", 5]], xlim = L, ylim = L, pch = ".", cex = 1,
main = "Both.")
abline(v = x.y[5], lty = 2)
abline(h = x.y[5], lty = 2)
legend("topright", legend = c(paste(both.pd[5]*n)), bty = "n")
# any
par(pty = "s")
plot(X[["any", 5]], xlim = L, ylim = L, pch = ".", cex = 1,
main = "Any.")
abline(v = x.y[5], lty = 2)
abline(h = x.y[5], lty = 2)
legend("topright", legend = c(paste(any.pd[5]*n)), bty = "n")
# onlyfirm1
par(pty = "s")
plot(X[["onlyf1", 5]], xlim = L, ylim = L, pch = ".", cex = 1,
main = "Only f1.")
abline(v = x.y[5], lty = 2)
abline(h = x.y[5], lty = 2)
legend("topright", legend = c(paste(onlyf1.pd[5]*n)), bty = "n")
# onlyfirm2
par(pty = "s")
plot(X[["onlyf2", 5]], xlim = L, ylim = L, pch = ".", cex = 1,
main = "Only f2.")
abline(v = x.y[5], lty = 2)
abline(h = x.y[5], lty = 2)
legend("topright", legend = c(paste(onlyf2.pd[5]*n)), bty = "n")
# one
par(pty = "s")
plot(X[["one", 5]], xlim = L, ylim = L, pch = ".", cex = 1,
main = "One.")
abline(h = x.y[5], lty = 2)
abline(v = x.y[5], lty = 2)
legend("topright", legend = c(paste(one.pd[5]*n)), bty = "n")
Figure 3.20: Which firm defaults at year 5?
It is interesting to compare two different correlation values. Here, we compare 0 versus 0.2.
m <- 2 # number of firms
n <- 10000 # number of simulations
rho_pos02 <- 0 # correlation
corr_pos02 <- matrix(rep(rho_pos02, m * m), m, m) # correlation matrix
diag(corr_pos02) <- 1
set.seed(130575)
X0 <- mvrnorm(n, mu = rep(0, m), Sigma = corr_pos02)
X0 <- data.frame(X0)
colnames(X0) <- c("f1", "f2")
Just as before, we create a function, evaluate it, and store results in X0.
fun.X0 <- function(x) {
both <- X0[X0$f2 < x & X0$f1 < x, ]
any <- X0[X0$f2 < x | X0$f1 < x, ]
onlyf1 <- X0[X0$f2 > x & X0$f1 < x, ]
onlyf2 <- X0[X0$f1 > x & X0$f2 < x, ]
one <- X0[(X0$f2 < x & X0$f1 > x |
X0$f2 > x & X0$f1 < x),]
none <- X0[X0$f2 > x & X0$f1 > x, ]
ans <- list(both = both, any = any, onlyf1 = onlyf1,
onlyf2 = onlyf2, one = one, none = none,
both.pd = (nrow(both) / n), any.pd = (nrow(any) / n),
onlyf1.pd = (nrow(onlyf1) / n), onlyf2.pd = (nrow(onlyf2) /n),
one.pd = (nrow(one) / n), none.pd = (nrow(none) / n)) }
X0 <- mapply(fun.X0, x.y)
none.pd0 <- data.frame(X0["none.pd",]) * n
both.pd0 <- data.frame(X0["both.pd",]) * n
onlyf1.pd0 <- data.frame(X0["onlyf1.pd",]) * n
onlyf2.pd0 <- data.frame(X0["onlyf2.pd",]) * n
A graphical analysis shows that in the case of 0.2 it is more likely that both firms default at the same time, and it is less likely that any firm default at the same time.
par(mfrow=c(1, 2), oma = c(0, 0, 2, 0))
par(pty = "s")
plot(X[["none", 5]], pch = ".", xlim = L, ylim = L,
cex = 1, main = "Correlation = 0.2")
points(X[["both", 5]], pch = ".", col = "red")
points(X[["onlyf1", 5]], pch = ".", col = "purple")
points(X[["onlyf2", 5]], pch = ".", col = "blue")
abline(v = x.y[5], lty = 2)
abline(h = x.y[5], lty = 2)
legend("bottomleft", legend = c(paste(both.pd[5]*n)), bty = "n")
legend("topright", legend = c(paste(none.pd[5]*n)), bty = "n")
legend("topleft", legend = c(paste(onlyf1.pd[5]*n)), bty = "n")
legend("bottomright", legend = c(paste(onlyf2.pd[5]*n)), bty = "n")
par(pty = "s")
plot(X0[["none", 5]], pch = ".", xlim = L, ylim = L,
cex = 1, main = "Correlation = 0.")
points(X0[["both", 5]], pch = ".", col = "red")
points(X0[["onlyf1", 5]], pch = ".", col = "purple")
points(X0[["onlyf2", 5]], pch = ".", col = "blue")
abline(v = x.y[5], lty = 2)
abline(h = x.y[5], lty = 2)
legend("bottomleft", legend = c(paste(both.pd0[5])), bty = "n")
legend("topright", legend = c(paste(none.pd0[5])), bty = "n")
legend("topleft", legend = c(paste(onlyf1.pd0[5])), bty = "n")
legend("bottomright", legend = c(paste(onlyf2.pd0[5])), bty = "n")
Figure 3.21: Cases per quadrant. Dotted lines corresponds to year 5.
par(pty = "s")
Very interesting indeed.
The Gaussian copula model is not the only copula model. See how other distributions work.
t-coupla:
embed_url("https://youtu.be/XZPVnanmEtw")
Gumbel coupla:
embed_url("https://youtu.be/Oq27fu86ahU")
Frank coupla:
embed_url("https://youtu.be/4OOF_FTrZ3w")
Clayton coupla:
embed_url("https://youtu.be/vtzkgnWjFQ4")
The original Hull (2015) example proposes a 10 firm case and here we implement this example following a simulation approach. First, we need a \(10\times10\) correlation matrix to produce the new \(x\) values using a random multi-variate distribution algorithm. According to Hull (2015) example the copula default correlations between each pair of companies is 0.2. The code below has the option to vary the default correlation given a uniform random distribution.
The new \(10\times10\) correlation matrix is then:
# Create the correlation matrix.
m <- 10 # number of firms.
n <- 1000000 # number of simulations per firm.
x <- matrix(rep(0.2, m * m), m, m)
ind <- lower.tri(x)
x[ind] <- t(x)[ind]
diag(x) = 1
kable(x, caption = "Correlation matrix 0.2.")
| 1.0 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 |
| 0.2 | 1.0 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 |
| 0.2 | 0.2 | 1.0 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 |
| 0.2 | 0.2 | 0.2 | 1.0 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 |
| 0.2 | 0.2 | 0.2 | 0.2 | 1.0 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 |
| 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 1.0 | 0.2 | 0.2 | 0.2 | 0.2 |
| 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 1.0 | 0.2 | 0.2 | 0.2 |
| 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 1.0 | 0.2 | 0.2 |
| 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 1.0 | 0.2 |
| 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 1.0 |
Now, we can simulate the multivariate normal distribution. The variable X10 length is 1,000,000 for each firm. We choose a higher number of simulations because otherwise the most restrictive constraints would not lead to any observations.
# Create the simulated cases.
set.seed(130575) # Reproducibility.
X10 <- mvrnorm(n, mu = rep(0, m), Sigma = x)
X10 <- data.frame(X10) # 10,000,000 observations.
colnames(X10) <- c("f1", "f2", "f3", "f4", "f5", "f6", "f7", "f8", "f9", "f10")
kable(head(X10), caption = "10 firms' performance, 1,000,000 simulations.",
digits = 3)
| f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 |
|---|---|---|---|---|---|---|---|---|---|
| 1.541 | 0.747 | 2.270 | 0.546 | 1.500 | 0.106 | 1.587 | 1.269 | 0.453 | -0.527 |
| 1.867 | -0.840 | 0.272 | -0.935 | -0.323 | 0.688 | -0.411 | -0.642 | 2.869 | 1.432 |
| -1.924 | -0.058 | 1.678 | 1.840 | -0.964 | 0.211 | -0.277 | 0.704 | 0.197 | 0.482 |
| -0.451 | -0.997 | -1.471 | -0.565 | 0.227 | 0.429 | -1.809 | -0.783 | -0.659 | 0.983 |
| -1.105 | 0.219 | 0.191 | 1.552 | -1.037 | 0.480 | 1.277 | -1.878 | -1.581 | -0.066 |
| -0.968 | -0.284 | 0.309 | -1.529 | 0.012 | -1.413 | -0.188 | -0.939 | -0.496 | -0.174 |
How do we extract those cases in which all 10 firms will default in 5 years (at the same time)? Here are the first 6 of those cases. Note that in all cases the X10 values are lower than \(-1.036433\).
# Given that we have 10 firms, it is easier to use filter_all function.
# Although probably this could be simplified even further.
y5.all.2 <- filter_all(X10, all_vars(. < x.y[5]))
y4.all.2 <- filter_all(X10, all_vars(. < x.y[4]))
y3.all.2 <- filter_all(X10, all_vars(. < x.y[3]))
y2.all.2 <- filter_all(X10, all_vars(. < x.y[2]))
y1.all.2 <- filter_all(X10, all_vars(. < x.y[1]))
Let’s analyze the case of 5 years.
kable(head(y5.all.2),
caption = "Cases in which all 10 firms default in five years.",
digits = 3)
| f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 |
|---|---|---|---|---|---|---|---|---|---|
| -1.793 | -3.143 | -1.945 | -1.054 | -3.511 | -1.746 | -1.789 | -2.098 | -1.998 | -2.809 |
| -1.423 | -1.779 | -1.209 | -1.486 | -1.304 | -1.630 | -1.455 | -2.358 | -1.238 | -1.070 |
| -2.272 | -1.640 | -2.152 | -2.374 | -1.943 | -3.206 | -1.372 | -1.582 | -2.682 | -3.230 |
| -3.229 | -1.487 | -1.443 | -2.503 | -1.582 | -1.050 | -1.202 | -1.525 | -1.608 | -3.308 |
| -1.172 | -1.079 | -1.508 | -1.605 | -2.093 | -1.407 | -1.972 | -3.347 | -1.878 | -1.624 |
| -2.925 | -2.470 | -1.483 | -1.612 | -1.452 | -1.404 | -2.710 | -2.011 | -1.296 | -2.440 |
kable((head(y5.all.2) < x.y[5]), caption = "Check if all of them default.")
| f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 |
|---|---|---|---|---|---|---|---|---|---|
| TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE |
| TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE |
| TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE |
| TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE |
| TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE |
| TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE |
How many cases are there?
nrow(y5.all.2)
## [1] 72
Which are those 72 cases?
# Here, I compare only firm 1 as if firm 1 defaults, then the rest default.
kable(matrix(which(X10[,1] %in% y5.all.2[,1]), 6, 9),
caption = "Which of the 1,000,000 cases represent a default of all
firms in five years?")
| 16257 | 99432 | 171012 | 319557 | 413425 | 511731 | 643935 | 715625 | 796118 |
| 32298 | 101750 | 266761 | 353059 | 423852 | 527750 | 644895 | 727562 | 797265 |
| 57098 | 104810 | 297762 | 391778 | 437629 | 574643 | 647511 | 736041 | 808143 |
| 62188 | 118416 | 300925 | 402689 | 444338 | 597147 | 690749 | 745317 | 808624 |
| 74173 | 125947 | 311908 | 405804 | 492912 | 618391 | 700476 | 747221 | 813924 |
| 81982 | 132862 | 317946 | 406553 | 509996 | 622673 | 705705 | 749223 | 823963 |
In total, we only have 72 cases. This is, the 10 firms will default at the same time in 5 years in 72 out of 1,000,000 total cases. For the rest of the years, the cases are less frequent, in fact we have zero cases for year 1, 2 and 3.
How do we extract those cases in which at least one of the 10 firms will default?
y5.any.2 <- filter_all(X10, any_vars(. < x.y[5]))
y4.any.2 <- filter_all(X10, any_vars(. < x.y[4]))
y3.any.2 <- filter_all(X10, any_vars(. < x.y[3]))
y2.any.2 <- filter_all(X10, any_vars(. < x.y[2]))
y1.any.2 <- filter_all(X10, any_vars(. < x.y[1]))
Here are the first 6 cases for the 5-year default.
kable(head(y5.any.2), caption = "At least one firm default in five years.",
digits = 3)
| f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 |
|---|---|---|---|---|---|---|---|---|---|
| -1.924 | -0.058 | 1.678 | 1.840 | -0.964 | 0.211 | -0.277 | 0.704 | 0.197 | 0.482 |
| -0.451 | -0.997 | -1.471 | -0.565 | 0.227 | 0.429 | -1.809 | -0.783 | -0.659 | 0.983 |
| -1.105 | 0.219 | 0.191 | 1.552 | -1.037 | 0.480 | 1.277 | -1.878 | -1.581 | -0.066 |
| -0.968 | -0.284 | 0.309 | -1.529 | 0.012 | -1.413 | -0.188 | -0.939 | -0.496 | -0.174 |
| -0.836 | -0.828 | 0.982 | 0.375 | -0.341 | -1.382 | -2.533 | -1.268 | -1.214 | -1.069 |
| -0.495 | -1.382 | -0.932 | -0.625 | 1.327 | -2.546 | 0.071 | -2.605 | -1.116 | -1.340 |
kable((head(y5.any.2) < x.y[5]), caption = "Check which one(s) default.")
| f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 |
|---|---|---|---|---|---|---|---|---|---|
| TRUE | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE |
| FALSE | FALSE | TRUE | FALSE | FALSE | FALSE | TRUE | FALSE | FALSE | FALSE |
| TRUE | FALSE | FALSE | FALSE | TRUE | FALSE | FALSE | TRUE | TRUE | FALSE |
| FALSE | FALSE | FALSE | TRUE | FALSE | TRUE | FALSE | FALSE | FALSE | FALSE |
| FALSE | FALSE | FALSE | FALSE | FALSE | TRUE | TRUE | TRUE | TRUE | TRUE |
| FALSE | TRUE | FALSE | FALSE | FALSE | TRUE | FALSE | TRUE | TRUE | TRUE |
How many cases are these?
nrow(y5.any.2)
## [1] 682148
In total, we have 682,148 cases. This is, at least one of 10 firms will default in 5 years in 682,148 of 1,000,000 cases.
And how to convert them into probabilities?
atleastone02 <- t(data.frame(nrow(y1.any.2), nrow(y2.any.2),
nrow(y3.any.2), nrow(y4.any.2), nrow(y5.any.2)))
all02 <- t(data.frame(nrow(y1.all.2), nrow(y2.all.2),
nrow(y3.all.2), nrow(y4.all.2), nrow(y5.all.2)))
res02 <- data.frame(all02 / n, atleastone02 / n)
rownames(res02) <- n.year
colnames(res02) <- c("All firms", "At least one")
kable(res02, caption = "Probabilities of default (10 firms, corr=0.2).")
| All firms | At least one | |
|---|---|---|
| year 1 | 0.0e+00 | 0.087276 |
| year 2 | 0.0e+00 | 0.226406 |
| year 3 | 0.0e+00 | 0.386348 |
| year 4 | 1.2e-05 | 0.543162 |
| year 5 | 7.2e-05 | 0.682148 |
Let’s see the difference when we assume a different correlation matrix. This case, the correlation vary randomly between 0.45 and 0.65.
m <- 10 # number of firms
n <- 1000000 # number of simulations
set.seed(130575)
x <- matrix(runif(m * m, 0.45, 0.65), m, m)
ind <- lower.tri(x)
x[ind] <- t(x)[ind]
diag(x) = 1
kable(x, caption = "Correlation between 0.45 and 0.65.", digits = 3)
| 1.000 | 0.622 | 0.597 | 0.622 | 0.530 | 0.517 | 0.518 | 0.490 | 0.455 | 0.578 |
| 0.622 | 1.000 | 0.523 | 0.594 | 0.465 | 0.501 | 0.558 | 0.560 | 0.460 | 0.513 |
| 0.597 | 0.523 | 1.000 | 0.556 | 0.485 | 0.603 | 0.534 | 0.524 | 0.611 | 0.547 |
| 0.622 | 0.594 | 0.556 | 1.000 | 0.564 | 0.555 | 0.496 | 0.500 | 0.502 | 0.513 |
| 0.530 | 0.465 | 0.485 | 0.564 | 1.000 | 0.608 | 0.513 | 0.559 | 0.647 | 0.505 |
| 0.517 | 0.501 | 0.603 | 0.555 | 0.608 | 1.000 | 0.452 | 0.486 | 0.644 | 0.518 |
| 0.518 | 0.558 | 0.534 | 0.496 | 0.513 | 0.452 | 1.000 | 0.577 | 0.635 | 0.466 |
| 0.490 | 0.560 | 0.524 | 0.500 | 0.559 | 0.486 | 0.577 | 1.000 | 0.585 | 0.524 |
| 0.455 | 0.460 | 0.611 | 0.502 | 0.647 | 0.644 | 0.635 | 0.585 | 1.000 | 0.474 |
| 0.578 | 0.513 | 0.547 | 0.513 | 0.505 | 0.518 | 0.466 | 0.524 | 0.474 | 1.000 |
The resulting values of Xr are:
set.seed(130575)
Xr <- mvrnorm(n, mu = rep(0, m), Sigma = x)
Xr <- data.frame(Xr)
kable(head(Xr), caption = "Firms' performance.", digits = 3)
| X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9 | X10 |
|---|---|---|---|---|---|---|---|---|---|
| -2.107 | -1.206 | -2.033 | -1.730 | -0.760 | -2.207 | -0.933 | -1.141 | -0.296 | -1.308 |
| -1.346 | -2.023 | -1.661 | 0.372 | -0.326 | 0.874 | -0.328 | -0.315 | -0.712 | -0.266 |
| 0.178 | -1.366 | 1.120 | -1.732 | -0.582 | -0.820 | -0.058 | 0.379 | -0.151 | 0.294 |
| 0.641 | 1.168 | 1.146 | 1.028 | -0.116 | 1.488 | 1.292 | 0.430 | -0.186 | 0.500 |
| 1.263 | -0.132 | 1.482 | -0.572 | 0.485 | 0.649 | -0.415 | -1.470 | 0.669 | 0.797 |
| 0.538 | 0.810 | 0.435 | 0.215 | 0.797 | 1.257 | 1.007 | 1.166 | 1.767 | 0.181 |
Extract the cases in which all ten firms default at the same time in 5 years and the cases in which any firm default at the same time in 5 years.
Xr <- as_tibble(Xr)
# All firms.
Xr.y1.all <- filter_all(Xr, all_vars(. < x.y[1]))
Xr.y2.all <- filter_all(Xr, all_vars(. < x.y[2]))
Xr.y3.all <- filter_all(Xr, all_vars(. < x.y[3]))
Xr.y4.all <- filter_all(Xr, all_vars(. < x.y[4]))
Xr.y5.all <- filter_all(Xr, all_vars(. < x.y[5]))
# At least one firm.
Xr.y1.any <- filter_all(Xr, any_vars(. < x.y[1]))
Xr.y2.any <- filter_all(Xr, any_vars(. < x.y[2]))
Xr.y3.any <- filter_all(Xr, any_vars(. < x.y[3]))
Xr.y4.any <- filter_all(Xr, any_vars(. < x.y[4]))
Xr.y5.any <- filter_all(Xr, any_vars(. < x.y[5]))
All firms default in 5,935 cases out of 1,000,000.
nrow(Xr.y5.all)
## [1] 5935
Let’s see the first 6 cases:
kable(head(Xr.y5.all),
caption = "Cases in which all 10 firms default in five years.",
digits = 3)
| X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9 | X10 |
|---|---|---|---|---|---|---|---|---|---|
| -1.332 | -2.060 | -1.775 | -2.084 | -2.537 | -1.948 | -2.566 | -2.234 | -1.185 | -1.897 |
| -1.935 | -1.393 | -2.217 | -1.527 | -1.609 | -1.794 | -1.680 | -1.235 | -2.578 | -1.188 |
| -1.575 | -1.215 | -1.974 | -1.116 | -1.557 | -2.516 | -1.933 | -2.999 | -2.253 | -2.294 |
| -2.197 | -2.460 | -2.398 | -2.829 | -2.294 | -1.923 | -2.744 | -2.217 | -2.970 | -1.460 |
| -2.201 | -1.222 | -1.372 | -2.246 | -1.607 | -1.540 | -3.018 | -1.733 | -1.201 | -1.597 |
| -2.106 | -2.355 | -2.526 | -2.044 | -2.002 | -2.416 | -1.833 | -2.936 | -2.041 | -2.080 |
Verify that those cases default.
kable((head(Xr.y5.all) < x.y[5]), caption = "Check if all of them default.")
| X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9 | X10 |
|---|---|---|---|---|---|---|---|---|---|
| TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE |
| TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE |
| TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE |
| TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE |
| TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE |
| TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE |
At least one firm default in 497,987 out of 1,000,000.
nrow(Xr.y5.any)
## [1] 497987
kable(head(Xr.y5.any), caption = "At least one firm default in five years.",
digits = 3)
| X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9 | X10 |
|---|---|---|---|---|---|---|---|---|---|
| -2.107 | -1.206 | -2.033 | -1.730 | -0.760 | -2.207 | -0.933 | -1.141 | -0.296 | -1.308 |
| -1.346 | -2.023 | -1.661 | 0.372 | -0.326 | 0.874 | -0.328 | -0.315 | -0.712 | -0.266 |
| 0.178 | -1.366 | 1.120 | -1.732 | -0.582 | -0.820 | -0.058 | 0.379 | -0.151 | 0.294 |
| 1.263 | -0.132 | 1.482 | -0.572 | 0.485 | 0.649 | -0.415 | -1.470 | 0.669 | 0.797 |
| -0.001 | -0.744 | -1.625 | 0.673 | -0.665 | -0.442 | -1.272 | 0.853 | -0.647 | -0.234 |
| -0.180 | -1.380 | -0.383 | -0.292 | 1.604 | 1.021 | 1.007 | 0.079 | 0.390 | -0.426 |
kable((head(Xr.y5.any) < x.y[5]), caption = "Check which one(s) default.")
| X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9 | X10 |
|---|---|---|---|---|---|---|---|---|---|
| TRUE | TRUE | TRUE | TRUE | FALSE | TRUE | FALSE | TRUE | FALSE | TRUE |
| TRUE | TRUE | TRUE | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE |
| FALSE | TRUE | FALSE | TRUE | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE |
| FALSE | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE | TRUE | FALSE | FALSE |
| FALSE | FALSE | TRUE | FALSE | FALSE | FALSE | TRUE | FALSE | FALSE | FALSE |
| FALSE | TRUE | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE |
And the corresponding probabilities:
allR <- t(data.frame(nrow(Xr.y1.all), nrow(Xr.y2.all),
nrow(Xr.y3.all), nrow(Xr.y4.all), nrow(Xr.y5.all)))
atleastoneR <- t(data.frame(nrow(Xr.y1.any), nrow(Xr.y2.any),
nrow(Xr.y3.any), nrow(Xr.y4.any), nrow(Xr.y5.any)))
resR <- data.frame(allR / n, atleastoneR / n)
ans <- data.frame(res02, resR)
rownames(ans) <- n.year
colnames(ans) <- c("All (corr=0.2)", "At least one (corr=0.2)",
"All (corr=rand)", "At least one (corr=rand)")
kable(ans, caption = "Probability of default.",
row.names = TRUE)
| All (corr=0.2) | At least one (corr=0.2) | All (corr=rand) | At least one (corr=rand) | |
|---|---|---|---|---|
| year 1 | 0.0e+00 | 0.087276 | 0.000018 | 0.062795 |
| year 2 | 0.0e+00 | 0.226406 | 0.000163 | 0.156515 |
| year 3 | 0.0e+00 | 0.386348 | 0.000825 | 0.267014 |
| year 4 | 1.2e-05 | 0.543162 | 0.002421 | 0.382983 |
| year 5 | 7.2e-05 | 0.682148 | 0.005935 | 0.497987 |
This replicates Hull (2015) example 24.7.
This is the Vasicek model (equation 24.10) in a function form.
CVaR <- function(exp, pd, r, c, l) {
v <- pnorm((qnorm(pd) + (c^0.5) * qnorm(l)) / (1 - c)^0.5)
VaR <- exp * v * (1 - r)
}
See if it works. Let’s evaluate equation 24.10 at 99% and 99.9%. The 1-year 99% and 99.9% credit VaR is:
CVaR.999 <- CVaR(100, 0.02, 0.6, 0.1, 0.999)
CVaR.99 <- CVaR(100, 0.02, 0.6, 0.1, 0.99)
CVaR.999
## [1] 5.129484
CVaR.99
## [1] 3.294271
Let’s evaluate the model at all confidence levels (from 0 to 1) simulating 10,000 values.
set.seed(13)
l <- runif(10000, 0, 1)
Loss <- CVaR(100, 0.02, 0.6, 0.1, l)
sim.var999 <- sort(Loss)[10000 * 0.999]
sim.var99 <- sort(Loss)[10000 * 0.99]
Now, visually:
hist(Loss, 100, xlim = c(0, 7), xlab = "Losses in millions", main = NULL)
abline(v = CVaR.999, lty = 2, col = "red", lwd = 2)
abline(v = CVaR.99, lty = 2, col = "blue", lwd = 2)
legend("topright", legend = c("1-year 99% credit VaR is 3.294271",
"1-year 99.9% credit VaR is 5.129484"),
col = c("blue", "red"), lwd = 2, lty = 2, bg = "white", cex = 0.8)
Figure 4.1: Distribution of losses.
dat <- data.frame(Loss, l)
ggplot(dat, aes(x = Loss, fill = "Losses in millions")) +
geom_density(color = "darkblue", fill = "lightblue") +
geom_vline(aes(xintercept = CVaR.999),
color = "red", linetype = "dashed", size = 1) +
geom_vline(aes(xintercept = CVaR.99 ),
color = "blue", linetype = "dashed", size = 1)
Figure 4.2: Distribution of losses.
Nice.
This document took 67.96 seconds to compile in Rmarkdown, R version 4.2.1 (2022-06-23 ucrt).